Learning to Explore by Reinforcement over High-Level Options
- URL: http://arxiv.org/abs/2111.01364v1
- Date: Tue, 2 Nov 2021 04:21:34 GMT
- Title: Learning to Explore by Reinforcement over High-Level Options
- Authors: Liu Juncheng, McCane Brendan, Mills Steven
- Abstract summary: We propose a new method which grants an agent two intertwined options of behaviors: "look-around" and "frontier navigation".
In each timestep, an agent produces an option and a corresponding action according to the policy.
We demonstrate the effectiveness of the proposed method on two publicly available 3D environment datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Autonomous 3D environment exploration is a fundamental task for various
applications such as navigation. The goal of exploration is to investigate a
new environment and build its occupancy map efficiently. In this paper, we
propose a new method which grants an agent two intertwined options of
behaviors: "look-around" and "frontier navigation". This is implemented by an
option-critic architecture and trained by reinforcement learning algorithms. In
each timestep, an agent produces an option and a corresponding action according
to the policy. We also take advantage of macro-actions by incorporating classic
path-planning techniques to increase training efficiency. We demonstrate the
effectiveness of the proposed method on two publicly available 3D environment
datasets and the results show our method achieves higher coverage than
competing techniques with better efficiency.
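The abstract names "frontier navigation" and map coverage but does not spell out how either is computed. As a minimal sketch only, assuming the classic definition of a frontier (a free cell that borders unexplored space) and assuming coverage is the fraction of grid cells that are no longer unknown, the two quantities could be computed on a small occupancy grid as follows. The cell labels `UNKNOWN`, `FREE`, `OCCUPIED` and the functions `find_frontiers` and `coverage` are hypothetical names introduced for illustration; they are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical cell labels for a 2D occupancy grid (assumption, not from the paper).
UNKNOWN, FREE, OCCUPIED = -1, 0, 1

Grid = List[List[int]]


def find_frontiers(grid: Grid) -> List[Tuple[int, int]]:
    """Return (row, col) of every free cell with a 4-connected unknown neighbour."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            # A free cell is a frontier if any in-bounds neighbour is unknown.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers


def coverage(grid: Grid) -> float:
    """Fraction of cells whose state (free or occupied) is already known."""
    total = sum(len(row) for row in grid)
    known = sum(cell != UNKNOWN for row in grid for cell in row)
    return known / total


# Toy 3x3 map: the right column is partly unexplored, the centre cell is an obstacle.
grid = [
    [FREE, FREE, UNKNOWN],
    [FREE, OCCUPIED, UNKNOWN],
    [FREE, FREE, FREE],
]

print(find_frontiers(grid))  # [(0, 1), (2, 2)]
print(coverage(grid))        # 7 of 9 cells are known
```

In a setup like the one the abstract describes, a "frontier navigation" option would pick one of these frontier cells as a goal and hand it to a path planner, while a "look-around" option would rotate the sensor in place; how the paper actually selects among frontiers is not stated here.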
Related papers
- Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents [49.85633804913796]
We present an exploration-based trajectory optimization approach, referred to as ETO.
This learning method is designed to enhance the performance of open LLM agents.
Our experiments on three complex tasks demonstrate that ETO consistently surpasses baseline performance by a large margin.
arXiv Detail & Related papers (2024-03-04T21:50:29Z)
- Probable Object Location (POLo) Score Estimation for Efficient Object Goal Navigation [15.623723522165731]
We introduce a novel framework centered around the Probable Object Location (POLo) score.
We further enhance the framework's practicality by introducing POLoNet, a neural network trained to approximate the computationally intensive POLo score.
Our experiments, involving the first phase of the OVMM 2023 challenge, demonstrate that an agent equipped with POLoNet significantly outperforms a range of baseline methods.
arXiv Detail & Related papers (2023-11-14T08:45:32Z)
- NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration [57.15811390835294]
This paper describes how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration.
We show that this unified policy results in better overall performance when navigating to visually indicated goals in novel environments.
Our experiments, conducted on a real-world mobile robot platform, show effective navigation in unseen environments in comparison with five alternative methods.
arXiv Detail & Related papers (2023-10-11T21:07:14Z)
- Ada-NAV: Adaptive Trajectory Length-Based Sample Efficient Policy Learning for Robotic Navigation [72.24964965882783]
Trajectory length plays a pivotal role in the training process of reinforcement learning algorithms.
We introduce Ada-NAV, a novel adaptive trajectory length scheme to enhance the training sample efficiency of RL algorithms.
We demonstrate through simulated and real-world robotic experiments that Ada-NAV outperforms conventional methods.
arXiv Detail & Related papers (2023-06-09T18:45:15Z)
- Exploration via Planning for Information about the Optimal Trajectory [67.33886176127578]
We develop a method that allows us to plan for exploration while taking the task and the current knowledge into account.
We demonstrate that our method learns strong policies with 2x fewer samples than strong exploration baselines.
arXiv Detail & Related papers (2022-10-06T20:28:55Z)
- Landmark Policy Optimization for Object Navigation Task [77.34726150561087]
This work studies the object goal navigation task, which involves navigating to the closest object of a given semantic category in unseen environments.
Recent works have shown significant achievements both in the end-to-end Reinforcement Learning approach and modular systems, but need a big step forward to be robust and optimal.
We propose a hierarchical method that incorporates standard task formulation and additional area knowledge as landmarks, with a way to extract these landmarks.
arXiv Detail & Related papers (2021-09-17T12:28:46Z)
- Deep Reinforcement Learning for Adaptive Exploration of Unknown Environments [6.90777229452271]
We develop an adaptive exploration approach to trade off between exploration and exploitation in one single step for UAVs.
The proposed approach uses a map segmentation technique to decompose the environment map into smaller, tractable maps.
The results demonstrate that our proposed approach is capable of navigating through randomly generated environments and covering more AoI in fewer time steps compared to the baselines.
arXiv Detail & Related papers (2021-05-04T16:29:44Z)
- Learning Task-Agnostic Action Spaces for Movement Optimization [18.37812596641983]
We propose a novel method for exploring the dynamics of physically based animated characters.
We parameterize actions as target states, and learn a short-horizon goal-conditioned low-level control policy that drives the agent's state towards the targets.
arXiv Detail & Related papers (2020-09-22T06:18:56Z)
- Learning Object Relation Graph and Tentative Policy for Visual Navigation [44.247995617796484]
It is critical to learn informative visual representation and robust navigation policy.
This paper proposes three complementary techniques: object relation graph (ORG), trial-driven imitation learning (IL), and a memory-augmented tentative policy network (TPN).
We report 22.8% and 23.5% increases in success rate and Success weighted by Path Length (SPL).
arXiv Detail & Related papers (2020-07-21T18:03:05Z)
- Efficient Exploration in Constrained Environments with Goal-Oriented Reference Path [15.679210057474922]
We train a deep convolutional network that can predict collision-free paths based on a map of the environment.
This is then used by a reinforcement learning algorithm to learn to closely follow the path.
We show that our method consistently improves the sample efficiency and generalization capability to novel environments.
arXiv Detail & Related papers (2020-03-03T17:07:47Z)
- Delving into 3D Action Anticipation from Streaming Videos [99.0155538452263]
Action anticipation aims to recognize the action with a partial observation.
We introduce several complementary evaluation metrics and present a basic model based on frame-wise action classification.
We also explore multi-task learning strategies by incorporating auxiliary information from two aspects: the full action representation and the class-agnostic action label.
arXiv Detail & Related papers (2019-06-15T10:30:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.