Learning to Explore by Reinforcement over High-Level Options
- URL: http://arxiv.org/abs/2111.01364v1
- Date: Tue, 2 Nov 2021 04:21:34 GMT
- Title: Learning to Explore by Reinforcement over High-Level Options
- Authors: Juncheng Liu, Brendan McCane, Steven Mills
- Abstract summary: We propose a new method that grants an agent two intertwined options of behavior: "look-around" and "frontier navigation".
At each timestep, the agent produces an option and a corresponding action according to the policy.
We demonstrate the effectiveness of the proposed method on two publicly available 3D environment datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Autonomous 3D environment exploration is a fundamental task for various
applications such as navigation. The goal of exploration is to investigate a
new environment and build its occupancy map efficiently. In this paper, we
propose a new method that grants an agent two intertwined options of
behavior: "look-around" and "frontier navigation". This is implemented with an
option-critic architecture and trained by reinforcement learning algorithms. At
each timestep, the agent produces an option and a corresponding action according
to the policy. We also take advantage of macro-actions by incorporating classic
path-planning techniques to increase training efficiency. We demonstrate the
effectiveness of the proposed method on two publicly available 3D environment
datasets, and the results show that our method achieves higher coverage than
competing techniques with better efficiency.
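The abstract describes a loop in which the agent repeatedly picks one of two options and, when "frontier navigation" is chosen, hands control to a classic planner executed as a macro-action. Below is a minimal, illustrative sketch of that loop on a 2D occupancy grid. The option names come from the paper, but the epsilon-greedy option selection and the BFS planner are placeholder stand-ins for the learned option-critic policy and the path-planning technique the authors actually use.

```python
from collections import deque

import numpy as np

# Occupancy grid convention assumed for this sketch.
UNKNOWN, FREE, OCCUPIED = -1, 0, 1
OPTIONS = ("look-around", "frontier-navigation")


def frontier_cells(grid):
    """Free cells bordering at least one unknown cell (candidate exploration targets)."""
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers


def bfs_path(grid, start, goal):
    """Shortest path through FREE cells; the whole path is executed as one macro-action."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if (0 <= nxt[0] < grid.shape[0] and 0 <= nxt[1] < grid.shape[1]
                    and grid[nxt] == FREE and nxt not in parent):
                parent[nxt] = cell
                queue.append(nxt)
    return None  # goal unreachable through currently known free space


def select_option(option_values, epsilon=0.1):
    """Epsilon-greedy stand-in for the learned policy over the two options."""
    if np.random.rand() < epsilon:
        return np.random.randint(len(OPTIONS))
    return int(np.argmax(option_values))
```

In the paper's method, both the option choice and the option termination are learned by the option-critic from the exploration reward; executing the planned path as a single macro-action reduces the number of decisions the policy must learn, which is the training-efficiency gain the abstract refers to.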
Related papers
- Reinforcement Learning with a Focus on Adjusting Policies to Reach Targets [0.0]
We propose a novel deep reinforcement learning method, which prioritizes achieving an aspiration level over maximizing expected return.
The results of the analysis showed two things: our method flexibly adjusts the exploration scope, and it has the potential to enable the agent to adapt to non-stationary environments.
arXiv Detail & Related papers (2024-12-23T07:16:47Z) - Goal-Reaching Policy Learning from Non-Expert Observations via Effective Subgoal Guidance [46.8322564551124]
We propose a novel subgoal guidance learning strategy.
We develop a diffusion strategy-based high-level policy to generate reasonable subgoals as waypoints.
We evaluate our method on complex robotic navigation and manipulation tasks.
arXiv Detail & Related papers (2024-09-06T02:49:12Z) - Optimizing TD3 for 7-DOF Robotic Arm Grasping: Overcoming Suboptimality with Exploration-Enhanced Contrastive Learning [0.0]
Insufficient exploration of the spatial space can result in suboptimal policies when controlling 7-DOF robotic arms.
We propose a novel Exploration-Enhanced Contrastive Learning (EECL) module that improves exploration by providing additional rewards for encountering novel states (a generic sketch of such a novelty bonus appears after this list).
We evaluate our method on the robosuite panda lift task, demonstrating that it significantly outperforms the baseline TD3 in terms of both efficiency and convergence speed in the tested environment.
arXiv Detail & Related papers (2024-08-26T04:30:59Z) - Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents [49.85633804913796]
We present an exploration-based trajectory optimization approach, referred to as ETO.
This learning method is designed to enhance the performance of open LLM agents.
Our experiments on three complex tasks demonstrate that ETO consistently surpasses baseline performance by a large margin.
arXiv Detail & Related papers (2024-03-04T21:50:29Z) - Probable Object Location (POLo) Score Estimation for Efficient Object
Goal Navigation [15.623723522165731]
We introduce a novel framework centered around the Probable Object Location (POLo) score.
We further enhance the framework's practicality by introducing POLoNet, a neural network trained to approximate the computationally intensive POLo score.
Our experiments, involving the first phase of the OVMM 2023 challenge, demonstrate that an agent equipped with POLoNet significantly outperforms a range of baseline methods.
arXiv Detail & Related papers (2023-11-14T08:45:32Z) - NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration [57.15811390835294]
This paper describes how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration.
We show that this unified policy results in better overall performance when navigating to visually indicated goals in novel environments.
Our experiments, conducted on a real-world mobile robot platform, show effective navigation in unseen environments in comparison with five alternative methods.
arXiv Detail & Related papers (2023-10-11T21:07:14Z) - CCE: Sample Efficient Sparse Reward Policy Learning for Robotic Navigation via Confidence-Controlled Exploration [72.24964965882783]
Confidence-Controlled Exploration (CCE) is designed to enhance the training sample efficiency of reinforcement learning algorithms for sparse reward settings such as robot navigation.
CCE is based on a novel relationship we provide between gradient estimation and policy entropy.
We demonstrate through simulated and real-world experiments that CCE outperforms conventional methods that employ constant trajectory lengths and entropy regularization.
arXiv Detail & Related papers (2023-06-09T18:45:15Z) - Exploration via Planning for Information about the Optimal Trajectory [67.33886176127578]
We develop a method that allows us to plan for exploration while taking the task and the current knowledge into account.
We demonstrate that our method learns strong policies with 2x fewer samples than strong exploration baselines.
arXiv Detail & Related papers (2022-10-06T20:28:55Z) - Landmark Policy Optimization for Object Navigation Task [77.34726150561087]
This work studies object goal navigation task, which involves navigating to the closest object related to the given semantic category in unseen environments.
Recent works have shown significant achievements in both end-to-end reinforcement learning approaches and modular systems, but both still need a substantial step forward to be robust and optimal.
We propose a hierarchical method that incorporates standard task formulation and additional area knowledge as landmarks, with a way to extract these landmarks.
arXiv Detail & Related papers (2021-09-17T12:28:46Z) - Learning Task-Agnostic Action Spaces for Movement Optimization [18.37812596641983]
We propose a novel method for exploring the dynamics of physically based animated characters.
We parameterize actions as target states, and learn a short-horizon goal-conditioned low-level control policy that drives the agent's state towards the targets.
arXiv Detail & Related papers (2020-09-22T06:18:56Z) - Efficient Exploration in Constrained Environments with Goal-Oriented
Reference Path [15.679210057474922]
We train a deep convolutional network that can predict collision-free paths based on a map of the environment.
This is then used by a reinforcement learning algorithm to learn to closely follow the path.
We show that our method consistently improves the sample efficiency and generalization capability to novel environments.
arXiv Detail & Related papers (2020-03-03T17:07:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.