Related papers: An Online Semantic Mapping System for Extending and Enhancing Visual SLAM

URL: http://arxiv.org/abs/2203.03944v1
Date: Tue, 8 Mar 2022 09:14:37 GMT
Title: An Online Semantic Mapping System for Extending and Enhancing Visual SLAM
Authors: Thorsten Hempel and Ayoub Al-Hamadi
Abstract summary: We present a real-time semantic mapping approach for mobile vision systems with a 2D-to-3D object detection pipeline and rapid data association for generated landmarks. Our system reaches real-time capabilities with an average iteration duration of 65 ms and is able to improve the pose estimation of a state-of-the-art SLAM system by up to 68% on a public dataset.
Score: 2.538209532048867
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: We present a real-time semantic mapping approach for mobile vision systems with a 2D-to-3D object detection pipeline and rapid data association for generated landmarks. Besides enriching the semantic map, the associated detections are introduced as semantic constraints into a simultaneous localization and mapping (SLAM) system for pose correction. This way, we are able to generate additional meaningful information that supports higher-level tasks, while simultaneously leveraging the view-invariance of object detections to improve the accuracy and robustness of the odometry estimation. We propose tracklets of locally associated object observations to handle ambiguous and false predictions, and an uncertainty-based greedy association scheme to accelerate processing. Our system reaches real-time capabilities with an average iteration duration of 65 ms and is able to improve the pose estimation of a state-of-the-art SLAM system by up to 68% on a public dataset. Additionally, we implemented our approach as a modular ROS package that is straightforward to integrate into arbitrary graph-based SLAM methods.
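
Illustrative sketch (not from the paper): the abstract mentions an uncertainty-based greedy association scheme but does not spell out its details. The snippet below shows one common way such a scheme can look, assuming each landmark carries a 3D position and a 3x3 covariance, and that candidate pairs are ranked by squared Mahalanobis distance and gated with a chi-square threshold. The function name `greedy_associate`, its parameters, and the gate value are assumptions made for this example; the authors' actual method may differ.

```python
import numpy as np


def greedy_associate(detections, landmarks, covariances, gate=7.81):
    """Greedily match 3D detections to landmarks.

    Hypothetical helper, not the paper's implementation.
    detections:  (N, 3) array of detected object positions
    landmarks:   (M, 3) array of landmark positions
    covariances: (M, 3, 3) array of landmark position covariances
    gate:        threshold on squared Mahalanobis distance
                 (7.81 is roughly the 95% chi-square quantile for 3 DOF)
    Returns (matches, unmatched) where matches is a list of
    (detection_index, landmark_index) pairs.
    """
    candidates = []
    for i, det in enumerate(detections):
        for j, (lm, cov) in enumerate(zip(landmarks, covariances)):
            diff = det - lm
            # Squared Mahalanobis distance: diff^T * cov^-1 * diff
            d2 = float(diff @ np.linalg.solve(cov, diff))
            if d2 <= gate:
                candidates.append((d2, i, j))

    # Greedy step: accept the most certain pairs first, one-to-one.
    candidates.sort()
    used_det, used_lm, matches = set(), set(), []
    for _, i, j in candidates:
        if i in used_det or j in used_lm:
            continue
        matches.append((i, j))
        used_det.add(i)
        used_lm.add(j)

    unmatched = [i for i in range(len(detections)) if i not in used_det]
    return matches, unmatched


if __name__ == "__main__":
    landmarks = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
    covariances = np.stack([np.eye(3) * 0.25, np.eye(3) * 0.25])
    detections = np.array([[0.1, 0.0, 0.0], [5.2, 0.0, 0.0], [20.0, 0.0, 0.0]])
    print(greedy_associate(detections, landmarks, covariances))
```

In this sketch, a detection that passes no gate stays unmatched and could seed a new landmark. A greedy pass trades the global optimality of an assignment solver for lower runtime, which is in line with the abstract's emphasis on processing time.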

Related papers

Pseudo Depth Meets Gaussian: A Feed-forward RGB SLAM Baseline [64.42938561167402]
We propose an online 3D reconstruction method using 3D Gaussian-based SLAM, combined with a feed-forward recurrent prediction module. This approach replaces slow test-time optimization with fast network inference, significantly improving tracking speed. Our method achieves performance on par with the state-of-the-art SplaTAM, while reducing tracking time by more than 90%.
arXiv Detail & Related papers (2025-08-06T16:16:58Z)
Real-Time Fusion of Visual and Chart Data for Enhanced Maritime Vision [2.14769181770878]
We present a novel approach to enhancing marine vision by fusing real-time visual data with chart information. Our system overlays nautical chart data onto live video feeds by accurately matching detected navigational aids, such as buoys, with their corresponding representations in chart data.
arXiv Detail & Related papers (2025-07-18T12:58:11Z)
POMATO: Marrying Pointmap Matching with Temporal Motion for Dynamic 3D Reconstruction [53.19968902152528]
We present POMATO, a unified framework for dynamic 3D reconstruction by marrying pointmap matching with temporal motion. Specifically, our method learns an explicit matching relationship by mapping RGB pixels from both dynamic and static regions across different views to 3D pointmaps. We show the effectiveness of the proposed pointmap matching and temporal fusion paradigm by demonstrating the remarkable performance across multiple downstream tasks.
arXiv Detail & Related papers (2025-04-08T05:33:13Z)
Improved LiDAR Odometry and Mapping using Deep Semantic Segmentation and Novel Outliers Detection [1.0334138809056097]
We propose a novel framework for real-time LiDAR odometry and mapping based on LOAM architecture for fast moving platforms. Our framework utilizes semantic information produced by a deep learning model to improve point-to-line and point-to-plane matching. We study the effect of improving the matching process on the robustness of LiDAR odometry against high speed motion.
arXiv Detail & Related papers (2024-03-05T16:53:24Z)
Volumetric Semantically Consistent 3D Panoptic Mapping [77.13446499924977]
We introduce an online 2D-to-3D semantic instance mapping algorithm aimed at generating semantic 3D maps suitable for autonomous agents in unstructured environments. It introduces novel ways of integrating semantic prediction confidence during mapping, producing semantic and instance-consistent 3D regions. The proposed method achieves accuracy superior to the state of the art on public large-scale datasets, improving on a number of widely used metrics.
arXiv Detail & Related papers (2023-09-26T08:03:10Z)
3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds. Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition. Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
arXiv Detail & Related papers (2022-09-21T02:33:07Z)
Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in Driving Scenes [82.4186966781934]
We introduce a simple, efficient, and effective two-stage detector, termed Ret3D. At the core of Ret3D is the utilization of novel intra-frame and inter-frame relation modules. With negligible extra overhead, Ret3D achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-08-18T03:48:58Z)
FD-SLAM: 3-D Reconstruction Using Features and Dense Matching [18.577229381683434]
We propose an RGB-D SLAM system that uses dense frame-to-model odometry to build accurate sub-maps. We incorporate a learning-based loop closure component based on 3-D features which further stabilises map building. The approach can also scale to large scenes where other systems often fail.
arXiv Detail & Related papers (2022-03-25T18:58:46Z)
Enhanced 3D Human Pose Estimation from Videos by using Attention-Based Neural Network with Dilated Convolutions [12.900524511984798]
We show a systematic design for how conventional networks and other forms of constraints can be incorporated into the attention framework. We achieve this by adapting the temporal receptive field via a multi-scale structure of dilated convolutions. Our method achieves state-of-the-art performance and outperforms existing methods by reducing the mean per joint position error to 33.4 mm on the Human3.6M dataset.
arXiv Detail & Related papers (2021-03-04T17:26:51Z)
Accurate Visual-Inertial SLAM by Feature Re-identification [4.263022790692934]
We propose an efficient drift-less SLAM method by re-identifying existing features from a spatial-temporal sensitive sub-global map. Our method achieves 67.3% and 87.5% absolute translation error reduction with only a small additional computational cost.
arXiv Detail & Related papers (2021-02-26T12:54:33Z)
Tight-Integration of Feature-Based Relocalization in Monocular Direct Visual Odometry [49.89611704653707]
We propose a framework for integrating map-based relocalization into online visual odometry. We integrate image features into Direct Sparse Odometry (DSO) and rely on feature matching to associate online visual odometry with a previously built map.
arXiv Detail & Related papers (2021-02-01T21:41:05Z)
Compositional Scalable Object SLAM [29.349829139625403]
We present a fast, scalable, and accurate Simultaneous Localization and Mapping (SLAM) system that represents indoor scenes as a graph of objects. We show that a compositional scalable object mapping formulation is amenable to a robust SLAM solution for drift-free large scale indoor reconstruction.
arXiv Detail & Related papers (2020-11-05T04:46:25Z)
Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties. Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates. The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.