Semantic Object-level Modeling for Robust Visual Camera Relocalization
- URL: http://arxiv.org/abs/2402.06951v1
- Date: Sat, 10 Feb 2024 13:39:44 GMT
- Title: Semantic Object-level Modeling for Robust Visual Camera Relocalization
- Authors: Yifan Zhu, Lingjuan Miao, Haitao Wu, Zhiqiang Zhou, Weiyi Chen,
Longwen Wu
- Abstract summary: We propose a novel method of automatic object-level voxel modeling for accurate ellipsoidal representations of objects.
All of these modules are fully integrated into a visual SLAM system.
- Score: 14.998133272060695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual relocalization is crucial for autonomous visual localization and
navigation of mobile robots. Thanks to improvements in CNN-based object
detection algorithms, the robustness of visual relocalization has been greatly
enhanced, especially at viewpoints where classical methods fail. However,
ellipsoids (quadrics) generated from axis-aligned object detections may limit the
accuracy of the object-level representation and degrade the performance of
the visual relocalization system. In this paper, we propose a novel method of
automatic object-level voxel modeling for accurate ellipsoidal representations
of objects. For visual relocalization, we design a better pose optimization
strategy for camera pose recovery that fully utilizes the projection
characteristics of the 2D fitted ellipses and the accurate 3D ellipsoids. All of
these modules are fully integrated into a visual SLAM system. Experimental
results show that our semantic object-level mapping and object-based visual
relocalization methods significantly enhance the performance of visual
relocalization in terms of robustness to new viewpoints.
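The abstract refers to the projection relationship between 3D ellipsoids and 2D ellipses. As a minimal sketch of that relationship, the block below uses the standard projective-geometry identity C* = P Q* P^T, which maps the dual quadric Q* of an ellipsoid to the dual conic C* of its image outline under a 3x4 camera matrix P. This is not the paper's pose optimization strategy; the function names and the example camera are hypothetical, and the code assumes the camera lies outside the ellipsoid.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def ellipsoid_dual_quadric(semi_axes, rotation, center):
    """Build Q* = Z diag(a^2, b^2, c^2, -1) Z^T with Z = [[R, t], [0, 1]]."""
    a, b, c = semi_axes
    z = [list(rotation[i]) + [center[i]] for i in range(3)]
    z.append([0.0, 0.0, 0.0, 1.0])
    d = [[a * a, 0.0, 0.0, 0.0],
         [0.0, b * b, 0.0, 0.0],
         [0.0, 0.0, c * c, 0.0],
         [0.0, 0.0, 0.0, -1.0]]
    return matmul(matmul(z, d), transpose(z))

def project_to_ellipse(q_dual, p):
    """Project a dual quadric with a 3x4 camera matrix: C* = P Q* P^T.

    Returns ((cx, cy), (major, minor)): the ellipse center in pixels and
    its semi-axis lengths. Assumes the camera is outside the ellipsoid,
    so that C*[2][2] is negative before normalization.
    """
    c_dual = matmul(matmul(p, q_dual), transpose(p))
    scale = -c_dual[2][2]  # normalize so that C*[2][2] == -1
    c_dual = [[v / scale for v in row] for row in c_dual]
    cx, cy = -c_dual[0][2], -c_dual[1][2]
    # The 2x2 shape matrix C*[:2,:2] + c c^T has the squared semi-axes
    # of the ellipse as its eigenvalues.
    sxx = c_dual[0][0] + cx * cx
    sxy = c_dual[0][1] + cx * cy
    syy = c_dual[1][1] + cy * cy
    mean = 0.5 * (sxx + syy)
    diff = math.hypot(0.5 * (sxx - syy), sxy)
    major = math.sqrt(mean + diff)
    minor = math.sqrt(max(mean - diff, 0.0))
    return (cx, cy), (major, minor)

if __name__ == "__main__":
    # Hypothetical setup: unit sphere 5 m in front of a pinhole camera
    # with focal length 500 px and principal point (320, 240).
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    q = ellipsoid_dual_quadric((1.0, 1.0, 1.0), identity, (0.0, 0.0, 5.0))
    p = [[500.0, 0.0, 320.0, 0.0],
         [0.0, 500.0, 240.0, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
    print(project_to_ellipse(q, p))
```

For a sphere of radius r centered on the optical axis at depth d, the outline is a circle at the principal point with radius f r / sqrt(d^2 - r^2), which gives a quick sanity check for the sketch.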
Related papers
- Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models [79.96917782423219]
Orient Anything is the first expert and foundational model designed to estimate object orientation in a single image.
By developing a pipeline to annotate the front face of 3D objects, we collect 2M images with precise orientation annotations.
Our model achieves state-of-the-art orientation estimation accuracy in both rendered and real images.
arXiv Detail & Related papers (2024-12-24T18:58:43Z)
- RELOCATE: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations [55.74675012171316]
RELOCATE is a training-free baseline designed to perform the challenging task of visual query localization in long videos.
To eliminate the need for task-specific training, RELOCATE leverages a region-based representation derived from pretrained vision models.
arXiv Detail & Related papers (2024-12-02T18:59:53Z)
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- VOOM: Robust Visual Object Odometry and Mapping using Hierarchical Landmarks [19.789761641342043]
We propose a Visual Object Odometry and Mapping framework VOOM.
We use high-level objects and low-level points as the hierarchical landmarks in a coarse-to-fine manner.
VOOM outperforms both object-oriented SLAM and feature-point SLAM systems in terms of localization.
arXiv Detail & Related papers (2024-02-21T08:22:46Z)
- LocaliseBot: Multi-view 3D object localisation with differentiable rendering for robot grasping [9.690844449175948]
We focus on object pose estimation.
Our approach relies on three pieces of information: multiple views of the object, the camera's parameters at those viewpoints, and 3D CAD models of objects.
We show that the estimated object pose results in 99.65% grasp accuracy with the ground truth grasp candidates.
arXiv Detail & Related papers (2023-11-14T14:27:53Z)
- Graphical Object-Centric Actor-Critic [55.2480439325792]
We propose a novel object-centric reinforcement learning algorithm combining actor-critic and model-based approaches.
We use a transformer encoder to extract object representations and graph neural networks to approximate the dynamics of an environment.
Our algorithm performs better in a visually complex 3D robotic environment and a 2D environment with compositional structure than the state-of-the-art model-free actor-critic algorithm.
arXiv Detail & Related papers (2023-10-26T06:05:12Z)
- OA-SLAM: Leveraging Objects for Camera Relocalization in Visual SLAM [2.016317500787292]
We show that the major benefit of objects lies in their higher-level semantic and discriminating power.
Our experiments show that the camera can be relocalized from viewpoints where classical methods fail.
Our code and test data are released at gitlab.inria.fr/tangram/oa-slam.
arXiv Detail & Related papers (2022-09-17T14:20:08Z)
- Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z)
- Category Level Object Pose Estimation via Neural Analysis-by-Synthesis [64.14028598360741]
In this paper we combine a gradient-based fitting procedure with a parametric neural image synthesis module.
The image synthesis network is designed to efficiently span the pose configuration space.
We experimentally show that the method can recover orientation of objects with high accuracy from 2D images alone.
arXiv Detail & Related papers (2020-08-18T20:30:47Z)
- OrcVIO: Object residual constrained Visual-Inertial Odometry [18.3130718336919]
This work presents OrcVIO, a visual-inertial odometry approach tightly coupled with tracking and optimization over structured object models.
OrcVIO's ability to perform accurate trajectory estimation and large-scale object-level mapping is evaluated using real data.
arXiv Detail & Related papers (2020-07-29T21:01:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.