SO-SLAM: Semantic Object SLAM with Scale Proportional and Symmetrical
Texture Constraints
- URL: http://arxiv.org/abs/2109.04884v1
- Date: Fri, 10 Sep 2021 13:55:37 GMT
- Title: SO-SLAM: Semantic Object SLAM with Scale Proportional and Symmetrical
Texture Constraints
- Authors: Ziwei Liao, Yutong Hu, Jiadong Zhang, Xianyu Qi, Xiaoyu Zhang, Wei
Wang
- Abstract summary: This paper proposes a novel monocular Semantic Object SLAM (SO-SLAM) system that introduces object spatial constraints.
We have verified the performance of the algorithm on public datasets and an author-recorded mobile robot dataset.
- Score: 9.694083816665525
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object SLAM introduces the concept of objects into Simultaneous Localization and Mapping (SLAM) and helps mobile robots and object-level interactive applications understand indoor scenes. State-of-the-art object SLAM systems face challenges such as partial observations, occlusions, and unobservable problems, which limit mapping accuracy and robustness. This paper proposes a novel monocular Semantic Object SLAM (SO-SLAM) system that addresses these challenges by introducing object spatial constraints. We explore three representative spatial constraints: the scale proportional constraint, the symmetrical texture constraint, and the plane supporting constraint. Based on these semantic constraints, we propose two new methods: a more robust object initialization method and an orientation fine-optimization method. We have verified the performance of the algorithm on public datasets and an author-recorded mobile robot dataset, achieving a significant improvement in mapping quality. We will release the code here: https://github.com/XunshanMan/SoSLAM.
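The code linked above is the authoritative reference; as a rough, hypothetical illustration of how a scale proportional constraint could enter a least-squares objective, assuming ellipsoid object models and a per-class scale prior (all names, classes, and values below are invented for the sketch, not taken from the paper):

```python
import numpy as np

# Hypothetical semantic prior: expected height-to-width ratio per class.
SCALE_PRIOR = {"chair": 1.8, "book": 1.4, "monitor": 0.8}


def scale_proportional_residual(axes, label, weight=1.0):
    """Residual penalizing deviation of an object ellipsoid's axis
    ratio from the semantic prior for its class.

    axes  : (3,) semi-axis lengths (a, b, c) of the ellipsoid
    label : semantic class of the object
    """
    a, b, c = axes
    ratio = c / max(a, b)      # height over the larger horizontal axis
    return weight * (ratio - SCALE_PRIOR[label])


# Example: a chair ellipsoid estimated from partial observations.
r = scale_proportional_residual(np.array([0.25, 0.25, 0.50]), "chair")
print(f"residual: {r:.3f}")   # a nonzero residual pulls the scale toward the prior
```

In a factor-graph backend, a residual like this would sit alongside the usual reprojection terms so that partially observed objects are regularized toward plausible proportions.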
Related papers
- IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments [56.85804719947]
We present IAAO, a framework that builds an explicit 3D model for intelligent agents to gain understanding of articulated objects in their environment through interaction.
We first build hierarchical features and label fields for each object state using 3D Gaussian Splatting (3DGS) by distilling mask features and view-consistent labels from multi-view images.
We then perform object- and part-level queries on the 3D Gaussian primitives to identify static and articulated elements, estimating global transformations and local articulation parameters along with affordances.
arXiv Detail & Related papers (2025-04-09T12:36:48Z)
- Convex Hull-based Algebraic Constraint for Visual Quadric SLAM [9.855936120653995]
Using quadrics as the object representation has the benefits of both generality and a closed-form projection between image and world spaces.
Although numerous algebraic constraints have been proposed for dual quadric reconstruction, we found that many of them are imprecise and provide minimal improvements to localization.
We introduce a concise yet more precise convex hull-based constraint for object landmarks.
Experiments on public datasets demonstrate that our approach is applicable to both monocular and RGB-D SLAM.
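The closed-form projection referred to here is the standard dual-quadric relation: a 4x4 dual quadric Q* maps to a 3x3 dual conic via C* = P Q* P^T, where P is the camera projection matrix. A minimal numpy sketch with illustrative camera values:

```python
import numpy as np

def project_dual_quadric(Q_star, P):
    """Project a 4x4 dual quadric to a 3x3 dual conic: C* = P Q* P^T."""
    C_star = P @ Q_star @ P.T
    return C_star / C_star[2, 2]       # normalize the homogeneous conic

# Dual quadric of a unit sphere at the origin: diag(1, 1, 1, -1).
Q_star = np.diag([1.0, 1.0, 1.0, -1.0])

# Illustrative pinhole camera 5 m in front of the sphere.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])
P = K @ Rt

C_star = project_dual_quadric(Q_star, P)
print(C_star)   # dual conic of the ellipse outlining the projected sphere
```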
arXiv Detail & Related papers (2025-03-03T07:30:07Z)
- SMORE: Simultaneous Map and Object REconstruction [66.66729715211642]
We present a method for dynamic surface reconstruction of large-scale urban scenes from LiDAR.
We take a holistic perspective and optimize a compositional model of a dynamic scene that decomposes the world into rigidly-moving objects and the background.
arXiv Detail & Related papers (2024-06-19T23:53:31Z)
- VOOM: Robust Visual Object Odometry and Mapping using Hierarchical Landmarks [19.789761641342043]
We propose VOOM, a Visual Object Odometry and Mapping framework.
We use high-level objects and low-level points as hierarchical landmarks in a coarse-to-fine manner.
VOOM outperforms both object-oriented SLAM and feature points SLAM systems in terms of localization.
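As a rough sketch of coarse-to-fine association with hierarchical landmarks, matching an object first and then restricting point matching to it; the criteria below (label plus box overlap, nearest-descriptor points) are hypothetical stand-ins, not necessarily VOOM's:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class ObjectLandmark:
    label: str
    box: np.ndarray        # projected box [x1, y1, x2, y2]
    points: np.ndarray     # (M, D) descriptors of points on the object

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = np.maximum(a[:2], b[:2])
    x2, y2 = np.minimum(a[2:], b[2:])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(det_label, det_box, det_desc, landmarks, min_iou=0.3):
    """Coarse: pick the best object by label and box overlap.
    Fine: match the detection's descriptors to that object's points."""
    candidates = [l for l in landmarks if l.label == det_label]
    if not candidates:
        return None, []
    best = max(candidates, key=lambda l: iou(det_box, l.box))
    if iou(det_box, best.box) < min_iou:
        return None, []
    matches = [(i, int(np.argmin(np.linalg.norm(best.points - d, axis=1))))
               for i, d in enumerate(det_desc)]
    return best, matches

rng = np.random.default_rng(0)
lm = ObjectLandmark("chair", np.array([10.0, 10.0, 50.0, 60.0]),
                    rng.normal(size=(5, 32)))
obj, m = associate("chair", np.array([12.0, 8.0, 52.0, 58.0]),
                   rng.normal(size=(3, 32)), [lm])
print(obj.label, m)    # matched object plus per-descriptor point matches
```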
arXiv Detail & Related papers (2024-02-21T08:22:46Z)
- Leveraging Positional Encoding for Robust Multi-Reference-Based Object 6D Pose Estimation [21.900422840817726]
Accurately estimating the pose of an object is a crucial task in computer vision and robotics.
In this paper, we analyze these limitations and propose new strategies to overcome them.
Our experiments on Linemod, Linemod-Occlusion, and YCB-Video datasets demonstrate that our approach outperforms existing methods.
arXiv Detail & Related papers (2024-01-29T16:42:15Z)
- UniQuadric: A SLAM Backend for Unknown Rigid Object 3D Tracking and Light-Weight Modeling [7.626461564400769]
We propose a novel SLAM backend that unifies ego-motion tracking, rigid object motion tracking, and modeling.
Our system showcases the potential application of object perception in complex dynamic scenes.
arXiv Detail & Related papers (2023-09-29T07:50:09Z)
- ROAM: Robust and Object-Aware Motion Generation Using Neural Pose Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object.
We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object.
We demonstrate substantial improvements in 3D virtual character motion and interaction quality and robustness to scenarios with unseen objects.
arXiv Detail & Related papers (2023-08-24T17:59:51Z)
- Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion [110.84357383258818]
We propose a novel approach to lift 2D segments to 3D and fuse them by means of a neural field representation.
The core of our approach is a slow-fast clustering objective function, which is scalable and well-suited for scenes with a large number of objects.
Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets.
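The summary leaves the objective abstract; one common reading of a slow-fast scheme is a fast model trained against targets from a slowly updated (EMA) copy of itself. Whether that matches the paper's exact formulation is an assumption here, but the pattern looks roughly like this:

```python
import numpy as np

def ema_update(slow, fast, momentum=0.99):
    """Slow parameters track the fast ones via an exponential moving average."""
    return momentum * slow + (1.0 - momentum) * fast

def slow_fast_contrastive_loss(fast_emb, slow_emb, same_instance, tau=0.1):
    """Pull fast embeddings toward slow embeddings of pixels in the same
    object instance; push other pairs apart.

    fast_emb, slow_emb : (N, D) unit-normalized embeddings
    same_instance      : (N, N) boolean matrix of positive pairs
    """
    sim = fast_emb @ slow_emb.T / tau                          # similarities
    log_p = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.where(same_instance, log_p, 0.0).sum() / same_instance.sum()

# Toy example: 4 pixels from 2 instances, 8-dim embeddings.
rng = np.random.default_rng(0)
fast = rng.normal(size=(4, 8))
fast /= np.linalg.norm(fast, axis=1, keepdims=True)
slow = ema_update(fast.copy(), fast)       # in training, applied every step
labels = np.array([0, 0, 1, 1])
same = labels[:, None] == labels[None, :]
print(slow_fast_contrastive_loss(fast, slow, same))
```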
arXiv Detail & Related papers (2023-06-07T17:57:45Z)
- An Object SLAM Framework for Association, Mapping, and High-Level Tasks [12.62957558651032]
We present a comprehensive object SLAM framework that focuses on object-based perception and object-oriented robot tasks.
A range of public datasets and real-world experiments have been used to evaluate the proposed object SLAM framework, demonstrating its efficient performance.
arXiv Detail & Related papers (2023-05-12T08:10:14Z)
- NeuSE: Neural SE(3)-Equivariant Embedding for Consistent Spatial Understanding with Objects [53.111397800478294]
We present NeuSE, a novel Neural SE(3)-Equivariant Embedding for objects.
NeuSE serves as a compact point cloud surrogate for complete object models.
Our proposed SLAM paradigm, using NeuSE for object shape and pose characterization, can operate independently or in conjunction with typical SLAM systems.
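SE(3)-equivariance means the embedding transforms with the object: f(Rx + t) = R f(x) + t. The toy check below uses the point-cloud centroid, a trivially equivariant feature, purely to illustrate the property that NeuSE's learned embedding is designed to satisfy:

```python
import numpy as np

def embed(points):
    """Toy SE(3)-equivariant 'embedding': the point-cloud centroid."""
    return points.mean(axis=0)

rng = np.random.default_rng(0)
cloud = rng.normal(size=(100, 3))

# A rigid transform (R, t): rotation about the z-axis plus a translation.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, -2.0, 0.5])

# Equivariance check: embedding the moved cloud equals moving the embedding.
lhs = embed(cloud @ R.T + t)
rhs = R @ embed(cloud) + t
print(np.allclose(lhs, rhs))   # True
```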
arXiv Detail & Related papers (2023-03-13T17:30:43Z)
- Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets.
However, they perform poorly on real-world data, which is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z)
- Improving Semantic Segmentation via Decoupled Body and Edge Supervision [89.57847958016981]
Existing semantic segmentation approaches either aim to improve the object's inner consistency by modeling the global context, or refine object details along their boundaries by multi-scale feature fusion.
In this paper, a new paradigm for semantic segmentation is proposed.
Our insight is that appealing performance of semantic segmentation requires explicitly modeling the object body and edge, which correspond to the high and low frequency of the image.
We show that the proposed framework with various baselines or backbone networks leads to better object inner consistency and object boundaries.
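As a toy illustration of that frequency decoupling, a fixed box blur below stands in for the paper's learned body-feature generation: the blurred map gives the smooth body component, and the residual concentrates at object boundaries:

```python
import numpy as np

def decouple_body_edge(feat, k=5):
    """Split a 2-D feature map into a smooth 'body' part (box-filter blur)
    and an 'edge' residual, so that feat == body + edge."""
    pad = k // 2
    padded = np.pad(feat, pad, mode="edge")
    body = np.zeros_like(feat)
    for i in range(feat.shape[0]):
        for j in range(feat.shape[1]):
            body[i, j] = padded[i:i + k, j:j + k].mean()
    return body, feat - body

feat = np.zeros((16, 16))
feat[4:12, 4:12] = 1.0                  # a square "object"
body, edge = decouple_body_edge(feat)
print(np.abs(edge).max())               # edge energy sits at the boundary
```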
arXiv Detail & Related papers (2020-07-20T12:11:22Z)
- Object-Centric Image Generation from Layouts [93.10217725729468]
We develop a layout-to-image-generation method to generate complex scenes with multiple objects.
Our method learns representations of the spatial relationships between objects in the scene, which lead to our model's improved layout-fidelity.
We introduce SceneFID, an object-centric adaptation of the popular Fréchet Inception Distance metric, that is better suited for multi-object images.
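The Fréchet distance underlying both FID and SceneFID is standard: fit a Gaussian to each feature set and compare means and covariances. A minimal sketch follows, with Inception feature extraction over per-object crops omitted and random features standing in:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_real, feats_fake):
    """FID = ||mu_r - mu_f||^2 + Tr(S_r + S_f - 2 (S_r S_f)^(1/2)).

    For SceneFID, the inputs would be Inception features of per-object
    crops rather than of whole images.
    """
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    S_r = np.cov(feats_real, rowvar=False)
    S_f = np.cov(feats_fake, rowvar=False)
    covmean = sqrtm(S_r @ S_f)
    if np.iscomplexobj(covmean):       # drop numerical imaginary residue
        covmean = covmean.real
    diff = mu_r - mu_f
    return diff @ diff + np.trace(S_r + S_f - 2.0 * covmean)

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(256, 16))
fake = rng.normal(0.5, 1.0, size=(256, 16))
print(frechet_distance(real, fake))    # grows with distribution mismatch
```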
arXiv Detail & Related papers (2020-03-16T21:40:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.