FrontierNet: Learning Visual Cues to Explore
- URL: http://arxiv.org/abs/2501.04597v1
- Date: Wed, 08 Jan 2025 16:25:32 GMT
- Title: FrontierNet: Learning Visual Cues to Explore
- Authors: Boyang Sun, Hanzhi Chen, Stefan Leutenegger, Cesar Cadena, Marc Pollefeys, Hermann Blum
- Abstract summary: This work aims at leveraging 2D visual cues for efficient autonomous exploration, addressing the limitations of extracting goal poses from a 3D map.
We propose an image-only frontier-based exploration system, with FrontierNet as a core component developed in this work.
Our approach provides an alternative to existing 3D-dependent exploration systems, achieving a 16% improvement in early-stage exploration efficiency.
- Score: 54.8265603996238
- License:
- Abstract: Exploration of unknown environments is crucial for autonomous robots; it allows them to actively reason and decide on what new data to acquire for tasks such as mapping, object discovery, and environmental assessment. Existing methods, such as frontier-based methods, rely heavily on 3D map operations, which are limited by map quality and often overlook valuable context from visual cues. This work aims at leveraging 2D visual cues for efficient autonomous exploration, addressing the limitations of extracting goal poses from a 3D map. We propose an image-only frontier-based exploration system, with FrontierNet as a core component developed in this work. FrontierNet is a learning-based model that (i) detects frontiers, and (ii) predicts their information gain, from posed RGB images enhanced by monocular depth priors. Our approach provides an alternative to existing 3D-dependent exploration systems, achieving a 16% improvement in early-stage exploration efficiency, as validated through extensive simulations and real-world experiments.
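As a reading aid, here is a minimal, hypothetical sketch of the goal-selection step such an image-only frontier pipeline implies: a depth prior enhances the posed RGB image, a learned model proposes frontiers with predicted information gain, and the best candidate is lifted to a 3D goal. The `depth_model` and `frontier_model` callables and the geometry helper are assumptions for illustration, not the authors' released code.

```python
import numpy as np


def lift_pixel_to_goal(pixel, depth, K, pose_c2w):
    """Back-project a 2D frontier pixel into a 3D goal position using the
    monocular depth prior, the camera intrinsics K, and the camera-to-world pose."""
    u, v = pixel
    z = float(depth[v, u])
    p_cam = np.linalg.inv(K) @ (z * np.array([u, v, 1.0]))
    return pose_c2w[:3, :3] @ p_cam + pose_c2w[:3, 3]


def select_exploration_goal(rgb, pose_c2w, K, depth_model, frontier_model):
    """One step of an image-only frontier-based exploration loop (illustrative).

    `depth_model(rgb)` is assumed to return a dense HxW depth map, and
    `frontier_model(rgb, depth)` a list of ((u, v), predicted_gain) candidates.
    """
    # 1) Enhance the posed RGB image with a monocular depth prior.
    depth = depth_model(rgb)

    # 2) Detect frontiers and predict their information gain directly in 2D.
    candidates = frontier_model(rgb, depth)

    # 3) Lift the highest-gain candidate into 3D and hand it to the planner.
    best_pixel, best_gain = max(candidates, key=lambda c: c[1])
    return lift_pixel_to_goal(best_pixel, depth, K, pose_c2w), best_gain
```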
Related papers
- Finding Waldo: Towards Efficient Exploration of NeRF Scene Spaces [4.944459754818577]
We propose and formally define the scene exploration framework as the efficient discovery of NeRF model inputs.
To remedy the lack of approaches addressing scene exploration, we first propose two baseline methods called Guided-Random Search (GRS) and Pose Interpolation-based Search (PIBS).
We then cast scene exploration as an optimization problem, and propose the criteria-agnostic Evolution-Guided Pose Search (EGPS) for exploration.
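To make the evolution-guided idea concrete, here is a generic, assumption-laden baseline (not the paper's EGPS implementation): keep a population of pose parameters, score them with any user-supplied exploration criterion, and mutate the elites. The 5-dimensional pose parameterization is an arbitrary assumption.

```python
import numpy as np


def evolution_guided_pose_search(score_fn, n_poses=64, n_iters=20,
                                 elite_frac=0.25, sigma=0.1, rng=None):
    """Generic evolutionary search over camera poses: `score_fn` maps a pose
    parameter vector to any user-chosen criterion, so the loop itself stays
    criteria-agnostic. Illustrative baseline only."""
    rng = np.random.default_rng(0) if rng is None else rng
    dim = 5                                   # x, y, z, yaw, pitch (assumed)
    population = rng.uniform(-1.0, 1.0, size=(n_poses, dim))
    n_elite = max(1, int(elite_frac * n_poses))

    for _ in range(n_iters):
        scores = np.array([score_fn(p) for p in population])
        elites = population[np.argsort(scores)[-n_elite:]]     # keep best poses
        # Refill the population by mutating randomly chosen elites.
        parents = elites[rng.integers(0, n_elite, size=n_poses)]
        population = parents + sigma * rng.standard_normal((n_poses, dim))
        population[:n_elite] = elites                           # elitism

    return population[np.argmax([score_fn(p) for p in population])]
```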
arXiv Detail & Related papers (2024-03-07T14:08:01Z)
- Reinforcement Learning with Frontier-Based Exploration via Autonomous Environment [0.0]
This research combines an existing Visual-Graph SLAM known as ExploreORB with reinforcement learning.
The proposed algorithm aims to improve the efficiency and accuracy of ExploreORB by optimizing the frontier exploration process, yielding a more accurate map.
arXiv Detail & Related papers (2023-07-14T12:19:46Z)
- Language-Guided 3D Object Detection in Point Cloud for Autonomous Driving [91.91552963872596]
We propose a new multi-modal visual grounding task, termed LiDAR Grounding.
It jointly learns the LiDAR-based object detector with the language features and predicts the targeted region directly from the detector.
Our work offers deeper insight into the LiDAR-based grounding task, and we expect it to present a promising direction for the autonomous driving community.
arXiv Detail & Related papers (2023-05-25T06:22:10Z)
- Learning-Augmented Model-Based Planning for Visual Exploration [8.870188183999854]
We propose a novel exploration approach using learning-augmented model-based planning.
The approach exploits visual sensing and advances in semantic mapping of indoor scenes.
Our approach surpasses the greedy strategies by 2.1% and the RL-based exploration methods by 8.4% in terms of coverage.
arXiv Detail & Related papers (2022-11-15T04:53:35Z)
- Paint and Distill: Boosting 3D Object Detection with Semantic Passing Network [70.53093934205057]
The 3D object detection task from lidar or camera sensors is essential for autonomous driving.
We propose a novel semantic passing framework, named SPNet, to boost the performance of existing lidar-based 3D detection models.
arXiv Detail & Related papers (2022-07-12T12:35:34Z)
- PillarGrid: Deep Learning-based Cooperative Perception for 3D Object Detection from Onboard-Roadside LiDAR [15.195933965761645]
We propose PillarGrid, a novel cooperative perception method fusing information from multiple 3D LiDARs.
PillarGrid consists of four main phases: 1) cooperative preprocessing of point clouds, 2) pillar-wise voxelization and feature extraction, 3) grid-wise deep fusion of features from multiple sensors, and 4) convolutional neural network (CNN)-based augmented 3D object detection.
Extensive experimentation shows that PillarGrid outperforms the SOTA single-LiDAR-based 3D object detection methods with respect to both accuracy and range by a large margin.
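The four phases above can be read as a simple data flow. The skeleton below is an assumption-laden illustration of that flow (crude per-cell point counts stand in for learned pillar features, and an element-wise max stands in for the learned deep fusion), not the PillarGrid code.

```python
import numpy as np


def transform_points(points: np.ndarray, T: np.ndarray) -> np.ndarray:
    """Apply a 4x4 rigid transform to an (N, 3) point cloud."""
    return points @ T[:3, :3].T + T[:3, 3]


def pillar_occupancy_grid(points: np.ndarray, grid_size: int = 128,
                          cell: float = 0.5) -> np.ndarray:
    """Crude 'pillar' feature: per-cell point count on a BEV grid, standing
    in for a learned pillar encoder purely for illustration."""
    grid = np.zeros((grid_size, grid_size), dtype=np.float32)
    idx = np.floor(points[:, :2] / cell).astype(int) + grid_size // 2
    mask = np.all((idx >= 0) & (idx < grid_size), axis=1)
    np.add.at(grid, (idx[mask, 0], idx[mask, 1]), 1.0)
    return grid


def pillargrid_like_pipeline(onboard: np.ndarray, roadside: np.ndarray,
                             T_roadside_to_onboard: np.ndarray, detect_fn):
    """Sketch of the four phases named in the abstract (assumed interfaces)."""
    # 1) Cooperative preprocessing: express both clouds in one frame.
    roadside_aligned = transform_points(roadside, T_roadside_to_onboard)

    # 2) Pillar-wise voxelization / feature extraction per sensor.
    feats = [pillar_occupancy_grid(onboard), pillar_occupancy_grid(roadside_aligned)]

    # 3) Grid-wise fusion of the per-sensor feature maps.
    fused = np.maximum(feats[0], feats[1])

    # 4) CNN-based 3D detection head, supplied by the caller.
    return detect_fn(fused)
```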
arXiv Detail & Related papers (2022-03-12T02:28:41Z)
- Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified and learning-based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
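A fully trainable message-passing step for data association can be illustrated with a toy edge-update module over a track/detection graph; the bipartite layout and all layer sizes are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn


class AssociationMessagePassing(nn.Module):
    """Toy message-passing step: each (track i, detection j) edge feature is
    updated from the two node features, then scored as an association logit."""

    def __init__(self, node_dim: int = 64, edge_dim: int = 32):
        super().__init__()
        self.edge_update = nn.Sequential(
            nn.Linear(2 * node_dim + edge_dim, edge_dim), nn.ReLU())
        self.edge_score = nn.Linear(edge_dim, 1)

    def forward(self, track_feats, det_feats, edge_feats):
        # track_feats: (T, D), det_feats: (N, D), edge_feats: (T, N, E)
        T, N = edge_feats.shape[:2]
        pairs = torch.cat([track_feats[:, None, :].expand(T, N, -1),
                           det_feats[None, :, :].expand(T, N, -1),
                           edge_feats], dim=-1)
        new_edges = self.edge_update(pairs)
        return self.edge_score(new_edges).squeeze(-1)   # (T, N) association logits


scores = AssociationMessagePassing()(torch.randn(3, 64), torch.randn(5, 64),
                                     torch.randn(3, 5, 32))   # -> (3, 5) logits
```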
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
- Rapid Exploration for Open-World Navigation with Latent Goal Models [78.45339342966196]
We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments.
At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images.
We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration.
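A toy version of such a latent distance-and-action model with an information bottleneck might look as follows; feature dimensions, priors, and heads are all assumptions for illustration, not the paper's model.

```python
import torch
import torch.nn as nn


class LatentGoalModel(nn.Module):
    """Toy stand-in: a goal feature is compressed into a latent z (information
    bottleneck), and (distance, action) are predicted from the current
    observation feature plus z."""

    def __init__(self, obs_dim: int = 512, z_dim: int = 32, act_dim: int = 2):
        super().__init__()
        self.goal_encoder = nn.Linear(obs_dim, 2 * z_dim)   # mean and log-variance
        self.head = nn.Sequential(
            nn.Linear(obs_dim + z_dim, 256), nn.ReLU(),
            nn.Linear(256, 1 + act_dim),                    # distance + action
        )
        self.z_dim = z_dim

    def encode_goal(self, goal_feat: torch.Tensor):
        mu, logvar = self.goal_encoder(goal_feat).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return z, mu, logvar          # a KL term on (mu, logvar) is the bottleneck

    def forward(self, obs_feat: torch.Tensor, z: torch.Tensor):
        out = self.head(torch.cat([obs_feat, z], dim=-1))
        return out[..., :1], out[..., 1:]                   # distance, action


# Exploration-style goal sampling: draw z from the prior instead of encoding a
# real goal image, then keep the candidates with the smallest predicted distance.
model = LatentGoalModel()
obs = torch.randn(1, 512)
candidates = torch.randn(16, model.z_dim)                   # z ~ N(0, I)
dists, _ = model(obs.expand(16, -1), candidates)
best_goal_latents = candidates[dists.squeeze(-1).topk(4, largest=False).indices]
```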
arXiv Detail & Related papers (2021-04-12T23:14:41Z)
- Occupancy Anticipation for Efficient Exploration and Navigation [97.17517060585875]
We propose occupancy anticipation, where the agent uses its egocentric RGB-D observations to infer the occupancy state beyond the visible regions.
By exploiting context in both the egocentric views and top-down maps, our model successfully anticipates a broader map of the environment.
Our approach is the winning entry in the 2020 Habitat PointNav Challenge.
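A minimal stand-in for this anticipation step, assuming a two-channel (occupied/explored) egocentric top-down crop as input; the full model uses egocentric RGB-D and map context and is far richer.

```python
import torch
import torch.nn as nn


class OccupancyAnticipator(nn.Module):
    """Minimal sketch: take the occupancy visible from the current view and
    predict occupied/explored probabilities for the surrounding region.
    Layer sizes are arbitrary assumptions."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),   # in: occupied, explored
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1),              # out logits: occupied, explored
        )

    def forward(self, visible_map: torch.Tensor) -> torch.Tensor:
        # visible_map: (B, 2, H, W) egocentric top-down crop built from depth.
        return torch.sigmoid(self.net(visible_map))      # anticipated map, same size


# Unexplored-but-likely-free cells in the anticipated map make natural
# exploration targets.
anticipated = OccupancyAnticipator()(torch.zeros(1, 2, 101, 101))
```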
arXiv Detail & Related papers (2020-08-21T03:16:51Z)