ManhattanSLAM: Robust Planar Tracking and Mapping Leveraging Mixture of
Manhattan Frames
- URL: http://arxiv.org/abs/2103.15068v1
- Date: Sun, 28 Mar 2021 07:11:57 GMT
- Title: ManhattanSLAM: Robust Planar Tracking and Mapping Leveraging Mixture of
Manhattan Frames
- Authors: Raza Yunus, Yanyan Li and Federico Tombari
- Abstract summary: An RGB-D SLAM system is proposed to utilize the structural information in indoor scenes, allowing for accurate tracking and efficient dense mapping on a CPU.
Planar surfels are initialized directly from sparse planes in our map, while non-planar surfels are built by extracting superpixels.
We evaluate our method on public benchmarks for pose estimation, drift and reconstruction accuracy, achieving superior performance compared to other state-of-the-art methods.
- Score: 41.33367060137042
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, a robust RGB-D SLAM system is proposed to utilize the
structural information in indoor scenes, allowing for accurate tracking and
efficient dense mapping on a CPU. Prior works have used the Manhattan World
(MW) assumption to estimate low-drift camera pose, in turn limiting the
applications of such systems. This paper, in contrast, proposes a novel
approach delivering robust tracking in MW and non-MW environments. We check
orthogonal relations between planes to directly detect Manhattan Frames,
modeling the scene as a Mixture of Manhattan Frames. For MW scenes, we decouple
pose estimation and provide a novel drift-free rotation estimation based on
Manhattan Frame observations. For translation estimation in MW scenes and full
camera pose estimation in non-MW scenes, we make use of point, line and plane
features for robust tracking in challenging scenes. Additionally, by
exploiting plane features detected in each frame, we also propose an efficient
surfel-based dense mapping strategy, which divides each image into planar and
non-planar regions. Planar surfels are initialized directly from sparse planes
in our map while non-planar surfels are built by extracting superpixels. We
evaluate our method on public benchmarks for pose estimation, drift and
reconstruction accuracy, achieving superior performance compared to other
state-of-the-art methods. We will open-source our code in the future.
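The Manhattan Frame detection the abstract describes, checking orthogonal relations between detected planes, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the angular tolerance, and the brute-force triple search are all assumptions; it simply looks for three mutually orthogonal plane normals.

```python
import numpy as np

ORTHO_THRESH_DEG = 5.0  # illustrative angular tolerance, not taken from the paper


def find_manhattan_frame(normals, thresh_deg=ORTHO_THRESH_DEG):
    """Search a set of unit plane normals for a mutually orthogonal triple.

    normals: (N, 3) array of unit plane normals from the current map.
    Returns indices (i, j, k) of a detected Manhattan Frame, or None.
    """
    # Two planes count as orthogonal if the angle between their normals
    # is within thresh_deg of 90 degrees, i.e. |dot| <= cos(90 - thresh).
    cos_tol = np.cos(np.deg2rad(90.0 - thresh_deg))
    n = len(normals)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(normals[i] @ normals[j]) > cos_tol:
                continue  # pair (i, j) is not orthogonal
            for k in range(j + 1, n):
                if (abs(normals[i] @ normals[k]) <= cos_tol
                        and abs(normals[j] @ normals[k]) <= cos_tol):
                    return i, j, k  # three mutually orthogonal planes
    return None
```

A scene may contain several such triples; modeling it as a Mixture of Manhattan Frames, as the paper proposes, would repeat this search over the remaining planes rather than stopping at the first match.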
Related papers
- OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments [77.0399450848749]
We propose an OccNeRF method for training occupancy networks without 3D supervision.
We parameterize the reconstructed occupancy fields and reorganize the sampling strategy to align with the cameras' infinite perceptive range.
For semantic occupancy prediction, we design several strategies to polish the prompts and filter the outputs of a pretrained open-vocabulary 2D segmentation model.
arXiv Detail & Related papers (2023-12-14T18:58:52Z)
- The Use of Multi-Scale Fiducial Markers To Aid Takeoff and Landing Navigation by Rotorcraft [5.528298061166612]
This paper quantifies the performance of visual SLAM that leverages multi-scale fiducial markers.
We evaluate performance during takeoff and landing operations in a variety of environmental conditions.
We release all of our results -- our dataset and the code of the implementation of the visual SLAM with fiducial markers -- to the public as open-source.
arXiv Detail & Related papers (2023-09-15T21:22:51Z)
- Stable Yaw Estimation of Boats from the Viewpoint of UAVs and USVs [14.573513188682183]
We propose a method based on HyperPosePDF for predicting the orientation of boats in the 6D space.
We extend HyperPosePDF to work in video-based scenarios, such that it yields robust orientation predictions across time.
arXiv Detail & Related papers (2023-06-24T20:47:37Z)
- P$^2$SDF for Neural Indoor Scene Reconstruction [29.355255923026597]
We propose a novel Pseudo Plane-regularized Signed Distance Field (P$^2$SDF) for indoor scene reconstruction.
Experiments show that our P$^2$SDF achieves competitive reconstruction performance in Manhattan scenes.
arXiv Detail & Related papers (2023-03-01T05:07:48Z)
- Ground Plane Matters: Picking Up Ground Plane Prior in Monocular 3D Object Detection [92.75961303269548]
The ground plane prior is a very informative geometry clue in monocular 3D object detection (M3OD).
We propose a Ground Plane Enhanced Network (GPENet) which resolves both issues at one go.
Our GPENet outperforms other methods and achieves state-of-the-art performance, demonstrating the effectiveness and superiority of the proposed approach.
arXiv Detail & Related papers (2022-11-03T02:21:35Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
- Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
- TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [55.30992853477754]
We present TANDEM, a real-time monocular tracking and dense mapping framework.
For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of keyframes.
TANDEM shows state-of-the-art real-time 3D reconstruction performance.
arXiv Detail & Related papers (2021-11-14T19:01:02Z)
- PlaneSegNet: Fast and Robust Plane Estimation Using a Single-stage Instance Segmentation CNN [12.251947429149796]
We propose a real-time deep neural architecture that estimates piece-wise planar regions from a single RGB image.
Our method achieves significantly higher frame-rates and comparable segmentation accuracy against two-stage methods.
arXiv Detail & Related papers (2021-03-29T08:53:05Z)
- Plane Pair Matching for Efficient 3D View Registration [7.920114031312631]
We present a novel method to estimate the motion matrix between overlapping pairs of 3D views in the context of indoor scenes.
We use the Manhattan world assumption to introduce lightweight geometric constraints, in the form of plane quadruplets, into the problem.
We validate our approach on a toy example and present quantitative experiments on a public RGB-D dataset, comparing against recent state-of-the-art methods.
arXiv Detail & Related papers (2020-01-20T11:15:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.