SLAM in the Field: An Evaluation of Monocular Mapping and Localization
on Challenging Dynamic Agricultural Environment
- URL: http://arxiv.org/abs/2011.01122v2
- Date: Fri, 6 Nov 2020 15:05:33 GMT
- Authors: Fangwen Shu, Paul Lesur, Yaxu Xie, Alain Pagani, Didier Stricker
- Abstract summary: This paper demonstrates a system capable of combining a sparse, indirect, monocular visual SLAM, with both offline and real-time Multi-View Stereo (MVS) reconstruction algorithms.
The use of a monocular SLAM makes our system much easier to integrate with an existing device, as we do not rely on a LiDAR.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper demonstrates a system capable of combining a sparse, indirect,
monocular visual SLAM, with both offline and real-time Multi-View Stereo (MVS)
reconstruction algorithms. This combination overcomes many obstacles
encountered by autonomous vehicles or robots employed in agricultural
environments, such as overly repetitive patterns, need for very detailed
reconstructions, and abrupt movements caused by uneven roads. Furthermore, the
use of a monocular SLAM makes our system much easier to integrate with an
existing device, as we do not rely on a LiDAR (which is expensive and
power-consuming) or a stereo camera (whose calibration is sensitive to external
perturbation, e.g., the camera being displaced). To the best of our knowledge, this
paper presents the first such evaluation of monocular SLAM in an agricultural
setting. Our work further explores unsupervised depth estimation in this specific
application scenario by simulating RGB-D SLAM to tackle the scale ambiguity, and
shows that our approach produces reconstructions that are helpful for various
agricultural tasks. Moreover, our experiments provide meaningful insight for
improving monocular SLAM systems in agricultural settings.
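The scale-ambiguity workaround mentioned in the abstract (feeding network-predicted depth into the pipeline to simulate RGB-D SLAM) requires aligning the up-to-scale monocular reconstruction with the metric depth predictions. A minimal sketch of one common alignment heuristic, median-ratio scaling, is shown below; this is an illustrative assumption, not the authors' exact pipeline, and `median_scale` is a hypothetical helper:

```python
import numpy as np

def median_scale(slam_depths, predicted_depths):
    """Estimate the single scale factor that best aligns up-to-scale
    SLAM depths with metrically scaled network predictions, using the
    median of per-point depth ratios (robust to outliers)."""
    ratios = np.asarray(predicted_depths, dtype=float) / np.asarray(slam_depths, dtype=float)
    return float(np.median(ratios))

# Toy example: pretend the SLAM map is the true metric scene
# shrunk by an unknown factor of 0.5.
true_depth = np.array([2.0, 4.0, 8.0, 1.5])   # metric depths (e.g., from a depth network)
slam_depth = 0.5 * true_depth                  # up-to-scale SLAM depths

s = median_scale(slam_depth, true_depth)       # recovered factor: 2.0
rescaled = s * slam_depth                      # now approximately metric
```

In practice the ratio would be computed only at sparse map points visible in both the SLAM map and the depth prediction, and the resulting factor applied to the trajectory and map before reconstruction.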
Related papers
- WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments [48.51530726697405]
We present WildGS-SLAM, a robust and efficient monocular RGB SLAM system designed to handle dynamic environments.
We introduce an uncertainty map, predicted by a shallow multi-layer perceptron and DINOv2 features, to guide dynamic object removal during both tracking and mapping.
Results showcase WildGS-SLAM's superior performance in dynamic environments compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-04-04T19:19:40Z)
- Multi-modal Multi-platform Person Re-Identification: Benchmark and Method [58.59888754340054]
MP-ReID is a novel dataset designed specifically for multi-modality and multi-platform ReID.
This benchmark compiles data from 1,930 identities across diverse modalities, including RGB, infrared, and thermal imaging.
We introduce Uni-Prompt ReID, a framework with specifically designed prompts, tailored for cross-modality and cross-platform scenarios.
arXiv Detail & Related papers (2025-03-21T12:27:49Z)
- Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera [49.82535393220003]
Dyn-HaMR is the first approach to reconstruct 4D global hand motion from monocular videos recorded by dynamic cameras in the wild.
We show that our approach significantly outperforms state-of-the-art methods in terms of 4D global mesh recovery.
This establishes a new benchmark for hand motion reconstruction from monocular video with moving cameras.
arXiv Detail & Related papers (2024-12-17T12:43:10Z)
- Simultaneous Map and Object Reconstruction [66.66729715211642]
We present a method for dynamic surface reconstruction of large-scale urban scenes from LiDAR.
We take inspiration from recent novel view synthesis methods and pose the reconstruction problem as a global optimization.
By careful modeling of continuous-time motion, our reconstructions can compensate for the rolling shutter effects of rotating LiDAR sensors.
arXiv Detail & Related papers (2024-06-19T23:53:31Z)
- Probing Multimodal LLMs as World Models for Driving [72.18727651074563]
We look at the application of Multimodal Large Language Models (MLLMs) in autonomous driving.
Despite advances in models like GPT-4o, their performance in complex driving environments remains largely unexplored.
arXiv Detail & Related papers (2024-05-09T17:52:42Z)
- MoD-SLAM: Monocular Dense Mapping for Unbounded 3D Scene Reconstruction [2.3630527334737104]
MoD-SLAM is the first monocular NeRF-based dense mapping method that allows 3D reconstruction in real-time in unbounded scenes.
By introducing a robust depth loss term into the tracking process, our SLAM system achieves more precise pose estimation in large-scale scenes.
Our experiments on two standard datasets show that MoD-SLAM achieves competitive performance, improving the accuracy of the 3D reconstruction and localization by up to 30% and 15% respectively.
arXiv Detail & Related papers (2024-02-06T07:07:33Z)
- Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular, Stereo, and RGB-D Cameras [27.543561055868697]
Photo-SLAM is a novel SLAM framework with a hyper primitives map.
We exploit explicit geometric features for localization and learn implicit photometric features to represent the texture information of the observed environment.
Our proposed system significantly outperforms current state-of-the-art SLAM systems for online photorealistic mapping.
arXiv Detail & Related papers (2023-11-28T12:19:00Z)
- Tightly-Coupled LiDAR-Visual SLAM Based on Geometric Features for Mobile Agents [43.137917788594926]
We propose a tightly-coupled LiDAR-visual SLAM based on geometric features.
The full line segments detected by the visual subsystem overcome the limitations of the LiDAR subsystem.
Our system achieves more accurate and robust pose estimation compared to current state-of-the-art multi-modal methods.
arXiv Detail & Related papers (2023-07-15T10:06:43Z)
- NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM [111.83168930989503]
NICER-SLAM is a dense RGB SLAM system that simultaneously optimizes for camera poses and a hierarchical neural implicit map representation.
We show strong performance in dense mapping, tracking, and novel view synthesis, even competitive with recent RGB-D SLAM systems.
arXiv Detail & Related papers (2023-02-07T17:06:34Z)
- SelfTune: Metrically Scaled Monocular Depth Estimation through Self-Supervised Learning [53.78813049373321]
We propose a self-supervised learning method for the pre-trained supervised monocular depth networks to enable metrically scaled depth estimation.
Our approach is useful for various applications such as mobile robot navigation and is applicable to diverse environments.
arXiv Detail & Related papers (2022-03-10T12:28:42Z)
- SGM3D: Stereo Guided Monocular 3D Object Detection [62.11858392862551]
We propose a stereo-guided monocular 3D object detection network, termed SGM3D.
We exploit robust 3D features extracted from stereo images to enhance the features learned from the monocular image.
Our method can be integrated into many other monocular approaches to boost performance without introducing any extra computational cost.
arXiv Detail & Related papers (2021-12-03T13:57:14Z)
- Improved Real-Time Monocular SLAM Using Semantic Segmentation on Selective Frames [15.455647477995312]
Monocular simultaneous localization and mapping (SLAM) is emerging in advanced driver assistance systems and autonomous driving.
This paper proposes an improved real-time monocular SLAM using deep learning-based semantic segmentation.
Experiments with six video sequences demonstrate that the proposed monocular SLAM system achieves significantly more accurate trajectory tracking.
arXiv Detail & Related papers (2021-04-30T22:34:45Z)
- BirdSLAM: Monocular Multibody SLAM in Bird's-Eye View [10.250859125675259]
We present BirdSLAM, a novel simultaneous localization and mapping (SLAM) system for autonomous driving platforms equipped with only a monocular camera.
BirdSLAM tackles challenges faced by other monocular SLAM systems by using an orthographic (bird's-eye) view as the configuration space in which localization and mapping are performed.
arXiv Detail & Related papers (2020-11-15T19:37:24Z)
- Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction [72.30870535815258]
Geometric SLAM and CNNs for monocular depth prediction represent two largely disjoint approaches towards building a 3D map of the surrounding environment.
We propose a joint narrow and wide baseline based self-improving framework, where on the one hand the CNN-predicted depth is leveraged to perform pseudo RGB-D feature-based SLAM.
On the other hand, the bundle-adjusted 3D scene structures and camera poses from the more principled geometric SLAM are injected back into the depth network through novel wide baseline losses.
arXiv Detail & Related papers (2020-04-22T16:31:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.