Learning to Explore by Reinforcement over High-Level Options
- URL: http://arxiv.org/abs/2111.01364v1
- Date: Tue, 2 Nov 2021 04:21:34 GMT
- Title: Learning to Explore by Reinforcement over High-Level Options
- Authors: Juncheng Liu, Brendan McCane, Steven Mills
- Abstract summary: We propose a new method that grants an agent two intertwined options of behavior: "look-around" and "frontier navigation".
At each timestep, the agent produces an option and a corresponding action according to the policy.
We demonstrate the effectiveness of the proposed method on two publicly available 3D environment datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Autonomous 3D environment exploration is a fundamental task for various
applications such as navigation. The goal of exploration is to investigate a
new environment and build its occupancy map efficiently. In this paper, we
propose a new method that grants an agent two intertwined options of
behavior: "look-around" and "frontier navigation". This is implemented with an
option-critic architecture and trained by reinforcement learning algorithms. At
each timestep, the agent produces an option and a corresponding action according
to the policy. We also take advantage of macro-actions by incorporating classic
path-planning techniques to increase training efficiency. We demonstrate the
effectiveness of the proposed method on two publicly available 3D environment
datasets, and the results show that our method achieves higher coverage than
competing techniques with better efficiency.
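The abstract describes a loop in which the agent repeatedly picks one of two options and, when "frontier navigation" is chosen, hands control to a classic planner executed as a macro-action. Below is a minimal, illustrative sketch of that loop on a 2D occupancy grid. The option names come from the paper, but the epsilon-greedy option selection and the BFS planner are placeholder stand-ins for the learned option-critic policy and the path-planning technique the authors actually use.

```python
from collections import deque

import numpy as np

# Occupancy grid convention assumed for this sketch.
UNKNOWN, FREE, OCCUPIED = -1, 0, 1
OPTIONS = ("look-around", "frontier-navigation")


def frontier_cells(grid):
    """Free cells bordering at least one unknown cell (candidate exploration targets)."""
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers


def bfs_path(grid, start, goal):
    """Shortest path through FREE cells; the whole path is executed as one macro-action."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if (0 <= nxt[0] < grid.shape[0] and 0 <= nxt[1] < grid.shape[1]
                    and grid[nxt] == FREE and nxt not in parent):
                parent[nxt] = cell
                queue.append(nxt)
    return None  # goal unreachable through currently known free space


def select_option(option_values, epsilon=0.1):
    """Epsilon-greedy stand-in for the learned policy over the two options."""
    if np.random.rand() < epsilon:
        return np.random.randint(len(OPTIONS))
    return int(np.argmax(option_values))
```

In the paper's method, both the option choice and the option termination are learned by the option-critic from the exploration reward; executing the planned path as a single macro-action reduces the number of decisions the policy must learn, which is the training-efficiency gain the abstract refers to.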
Related papers
- Reinforcement Learning with a Focus on Adjusting Policies to Reach Targets [0.0]
We propose a novel deep reinforcement learning method, which prioritizes achieving an aspiration level over maximizing expected return.
The results of the analysis showed two things: our method flexibly adjusts the exploration scope, and it has the potential to enable the agent to adapt to non-stationary environments.
arXiv Detail & Related papers (2024-12-23T07:16:47Z) - Goal-Reaching Policy Learning from Non-Expert Observations via Effective Subgoal Guidance [46.8322564551124]
We propose a novel subgoal guidance learning strategy.
We develop a diffusion strategy-based high-level policy to generate reasonable subgoals as waypoints.
We evaluate our method on complex robotic navigation and manipulation tasks.
arXiv Detail & Related papers (2024-09-06T02:49:12Z) - Optimizing TD3 for 7-DOF Robotic Arm Grasping: Overcoming Suboptimality with Exploration-Enhanced Contrastive Learning [0.0]
Insufficient exploration of the spatial space can result in suboptimal policies when controlling 7-DOF robotic arms.
We propose a novel Exploration-Enhanced Contrastive Learning (EECL) module that improves exploration by providing additional rewards for encountering novel states (a generic sketch of such a novelty bonus appears after this list).
We evaluate our method on the robosuite panda lift task, demonstrating that it significantly outperforms the baseline TD3 in terms of both efficiency and convergence speed in the tested environment.
arXiv Detail & Related papers (2024-08-26T04:30:59Z) - Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents [49.85633804913796]
We present an exploration-based trajectory optimization approach, referred to as ETO.
This learning method is designed to enhance the performance of open LLM agents.
Our experiments on three complex tasks demonstrate that ETO consistently surpasses baseline performance by a large margin.
arXiv Detail & Related papers (2024-03-04T21:50:29Z) - Probable Object Location (POLo) Score Estimation for Efficient Object
Goal Navigation [15.623723522165731]
We introduce a novel framework centered around the Probable Object Location (POLo) score.
We further enhance the framework's practicality by introducing POLoNet, a neural network trained to approximate the computationally intensive POLo score.
Our experiments, involving the first phase of the OVMM 2023 challenge, demonstrate that an agent equipped with POLoNet significantly outperforms a range of baseline methods.
arXiv Detail & Related papers (2023-11-14T08:45:32Z) - NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration [57.15811390835294]
This paper describes how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration.
We show that this unified policy results in better overall performance when navigating to visually indicated goals in novel environments.
Our experiments, conducted on a real-world mobile robot platform, show effective navigation in unseen environments in comparison with five alternative methods.
arXiv Detail & Related papers (2023-10-11T21:07:14Z) - CCE: Sample Efficient Sparse Reward Policy Learning for Robotic Navigation via Confidence-Controlled Exploration [72.24964965882783]
Confidence-Controlled Exploration (CCE) is designed to enhance the training sample efficiency of reinforcement learning algorithms for sparse reward settings such as robot navigation.
CCE is based on a novel relationship we provide between gradient estimation and policy entropy.
We demonstrate through simulated and real-world experiments that CCE outperforms conventional methods that employ constant trajectory lengths and entropy regularization.
arXiv Detail & Related papers (2023-06-09T18:45:15Z) - Exploration via Planning for Information about the Optimal Trajectory [67.33886176127578]
We develop a method that allows us to plan for exploration while taking the task and the current knowledge into account.
We demonstrate that our method learns strong policies with 2x fewer samples than strong exploration baselines.
arXiv Detail & Related papers (2022-10-06T20:28:55Z) - Landmark Policy Optimization for Object Navigation Task [77.34726150561087]
This work studies object goal navigation task, which involves navigating to the closest object related to the given semantic category in unseen environments.
Recent works have shown significant achievements in both end-to-end reinforcement learning approaches and modular systems, but both still need a substantial step forward to be robust and optimal.
We propose a hierarchical method that incorporates standard task formulation and additional area knowledge as landmarks, with a way to extract these landmarks.
arXiv Detail & Related papers (2021-09-17T12:28:46Z) - Learning Task-Agnostic Action Spaces for Movement Optimization [18.37812596641983]
We propose a novel method for exploring the dynamics of physically based animated characters.
We parameterize actions as target states, and learn a short-horizon goal-conditioned low-level control policy that drives the agent's state towards the targets.
arXiv Detail & Related papers (2020-09-22T06:18:56Z) - Efficient Exploration in Constrained Environments with Goal-Oriented
Reference Path [15.679210057474922]
We train a deep convolutional network that can predict collision-free paths based on a map of the environment.
This is then used by a reinforcement learning algorithm to learn to closely follow the path.
We show that our method consistently improves the sample efficiency and generalization capability to novel environments.
arXiv Detail & Related papers (2020-03-03T17:07:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.