NeuSE: Neural SE(3)-Equivariant Embedding for Consistent Spatial
Understanding with Objects
- URL: http://arxiv.org/abs/2303.07308v2
- Date: Mon, 10 Jul 2023 12:33:04 GMT
- Authors: Jiahui Fu, Yilun Du, Kurran Singh, Joshua B. Tenenbaum, and John J. Leonard
- Abstract summary: We present NeuSE, a novel Neural SE(3)-Equivariant Embedding for objects.
NeuSE serves as a compact point cloud surrogate for complete object models.
Our proposed SLAM paradigm, using NeuSE for object shape and pose characterization, can operate independently or in conjunction with typical SLAM systems.
- Score: 53.111397800478294
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present NeuSE, a novel Neural SE(3)-Equivariant Embedding for objects, and
illustrate how it supports object SLAM for consistent spatial understanding
with long-term scene changes. NeuSE is a set of latent object embeddings
created from partial object observations. It serves as a compact point cloud
surrogate for complete object models, encoding full shape information while
transforming SE(3)-equivariantly in tandem with the object in the physical
world. With NeuSE, relative frame transforms can be directly derived from
inferred latent codes. Our proposed SLAM paradigm, using NeuSE for object shape
and pose characterization, can operate independently or in conjunction with
typical SLAM systems. It directly infers SE(3) camera pose constraints that are
compatible with general SLAM pose graph optimization, while also maintaining a
lightweight object-centric map that adapts to real-world changes. Our approach
is evaluated on synthetic and real-world sequences featuring changed objects
and shows improved localization accuracy and change-aware mapping capability,
when working either standalone or jointly with a common SLAM pipeline.
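The claim that relative frame transforms can be derived directly from latent codes can be illustrated with a small sketch. It assumes, purely for illustration, that a latent code is an ordered list of 3D vectors that moves rigidly with the object, so the code seen in frame B equals `R @ code_A + t`. The helper names (`frame`, `relative_transform`) are hypothetical, and the noise-free three-point frame construction is a stand-in chosen for brevity, not the alignment procedure used in the paper.

```python
import math

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def add(a, b):
    return [a[i] + b[i] for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(a):
    n = math.sqrt(sum(x * x for x in a))
    return [x / n for x in a]

def matvec(R, v):
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def frame(p0, p1, p2):
    """Right-handed orthonormal axes built from three non-collinear points."""
    x = normalize(sub(p1, p0))
    z = normalize(cross(x, sub(p2, p0)))
    y = cross(z, x)
    return [x, y, z]  # list of axis vectors

def relative_transform(latent_a, latent_b):
    """Return (R, t) such that latent_b[i] = R @ latent_a[i] + t.

    Assumes noise-free, index-aligned latent vectors (an illustrative
    simplification); only the first three vectors are used.
    """
    fa = frame(*latent_a[:3])
    fb = frame(*latent_b[:3])
    # With axes as matrix columns, F_b = R F_a, hence R = F_b F_a^T.
    R = [[sum(fb[k][i] * fa[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = sub(latent_b[0], matvec(R, latent_a[0]))
    return R, t

# Toy latent code observed in frame A (values are made up).
latent_a = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.3, 0.2, 0.9]]

# Ground-truth motion: 90 degree rotation about z plus a translation.
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
R_true = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
t_true = [1.0, 2.0, 3.0]

# Equivariance: the latent code transforms exactly like the object does.
latent_b = [add(matvec(R_true, p), t_true) for p in latent_a]

R_est, t_est = relative_transform(latent_a, latent_b)
rot_err = max(abs(R_est[i][j] - R_true[i][j]) for i in range(3) for j in range(3))
trans_err = max(abs(t_est[i] - t_true[i]) for i in range(3))
print("max rotation error:", rot_err)
print("max translation error:", trans_err)
```

With noisy latent codes, a least-squares alignment over all vectors would be the more natural choice than a frame built from three of them; the sketch only shows why equivariance makes the relative transform recoverable at all.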
Related papers
- Go-SLAM: Grounded Object Segmentation and Localization with Gaussian Splatting SLAM [12.934788858420752]
Go-SLAM is a novel framework that utilizes 3D Gaussian Splatting SLAM to reconstruct dynamic environments.
Our system facilitates open-vocabulary querying, allowing users to locate objects using natural language descriptions.
(2024-09-25T13:56:08Z)
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
(2024-07-30T15:33:58Z)
- VOOM: Robust Visual Object Odometry and Mapping using Hierarchical Landmarks [19.789761641342043]
We propose a Visual Object Odometry and Mapping framework VOOM.
We use high-level objects and low-level points as the hierarchical landmarks in a coarse-to-fine manner.
VOOM outperforms both object-oriented SLAM and feature-point SLAM systems in terms of localization.
(2024-02-21T08:22:46Z)
- SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation [79.12683101131368]
Category-level object pose estimation, aiming to predict the 6D pose and 3D size of objects from known categories, typically struggles with large intra-class shape variation.
We present SecondPose, a novel approach integrating object-specific geometric features with semantic category priors from DINOv2.
(2023-11-18T17:14:07Z)
- 3DS-SLAM: A 3D Object Detection based Semantic SLAM towards Dynamic Indoor Environments [1.4901625182926226]
We introduce 3DS-SLAM, 3D Semantic SLAM, tailored for dynamic scenes with visual 3D object detection.
The 3DS-SLAM is a tightly-coupled algorithm resolving both semantic and geometric constraints sequentially.
It exhibits an average improvement of 98.01% across the dynamic sequences of the TUM RGB-D dataset.
(2023-10-10T07:48:40Z)
- ROAM: Robust and Object-Aware Motion Generation Using Neural Pose Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object.
We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object.
We demonstrate substantial improvements in 3D virtual character motion and interaction quality and robustness to scenarios with unseen objects.
(2023-08-24T17:59:51Z)
- TwistSLAM++: Fusing multiple modalities for accurate dynamic semantic SLAM [0.0]
TwistSLAM++ is a semantic, dynamic SLAM system that fuses stereo images and LiDAR information.
We show on classical benchmarks that this fusion approach based on multimodal information improves the accuracy of object tracking.
(2022-09-16T12:28:21Z)
- Robust Change Detection Based on Neural Descriptor Fields [53.111397800478294]
We develop an object-level online change detection approach that is robust to partially overlapping observations and noisy localization results.
By associating objects via shape code similarity and comparing local object-neighbor spatial layout, our proposed approach demonstrates robustness to low observation overlap and localization noise.
(2022-08-01T17:45:36Z)
- SE(3)-Equivariant Attention Networks for Shape Reconstruction in Function Space [50.14426188851305]
We propose the first SE(3)-equivariant coordinate-based network for learning occupancy fields from point clouds.
In contrast to previous shape reconstruction methods that align the input to a regular grid, we operate directly on the irregular, unoriented point cloud.
We show that our method outperforms previous SO(3)-equivariant methods, as well as non-equivariant methods trained on SO(3)-augmented datasets.
(2022-04-05T17:59:15Z)
- DSP-SLAM: Object Oriented SLAM with Deep Shape Priors [16.867669408751507]
We propose an object-oriented SLAM system that builds a rich and accurate joint map of dense 3D models for foreground objects.
DSP-SLAM takes as input the 3D point cloud reconstructed by a feature-based SLAM system.
Our evaluation shows improvements in object pose and shape reconstruction with respect to recent deep prior-based reconstruction methods.
(2021-08-21T10:00:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.