Reinforcement Learning with a Focus on Adjusting Policies to Reach Targets
- URL: http://arxiv.org/abs/2412.17344v1
- Date: Mon, 23 Dec 2024 07:16:47 GMT
- Title: Reinforcement Learning with a Focus on Adjusting Policies to Reach Targets
- Authors: Akane Tsuboya, Yu Kono, Tatsuji Takahashi
- Abstract summary: We propose a novel deep reinforcement learning method, which prioritizes achieving an aspiration level over maximizing expected return.
The results of the analysis showed two things: our method flexibly adjusts the exploration scope, and it has the potential to enable the agent to adapt to non-stationary environments.
- Abstract: The objective of a reinforcement learning agent is to discover better actions through exploration. However, typical exploration techniques aim to maximize rewards, often incurring high costs in both exploration and learning. We propose a novel deep reinforcement learning method that prioritizes achieving an aspiration level over maximizing expected return. This method flexibly adjusts the degree of exploration based on the proportion of the target that has been achieved. In experiments on a motion control task and a navigation task, this method achieved returns equal to or greater than those of other standard methods. The analysis showed two things: our method flexibly adjusts the exploration scope, and it has the potential to enable the agent to adapt to non-stationary environments. These findings indicate that this method may improve exploration efficiency in practical applications of reinforcement learning.
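The idea of tying exploration to target achievement can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in a plain epsilon-greedy rate that shrinks as the mean of recent returns approaches an aspiration level. The names `achievement_ratio`, `exploration_rate`, `eps_min`, `eps_max`, and the window size are all hypothetical, and the sketch assumes a positive aspiration level.

```python
from collections import deque


def achievement_ratio(recent_returns, aspiration):
    """Fraction of the aspiration level reached, clipped to [0, 1].

    Hypothetical helper: uses the mean of recent episode returns as the
    achieved level and assumes a positive aspiration level.
    """
    if aspiration <= 0:
        raise ValueError("this sketch assumes a positive aspiration level")
    if not recent_returns:
        return 0.0  # nothing observed yet, so nothing achieved
    mean_return = sum(recent_returns) / len(recent_returns)
    return min(1.0, max(0.0, mean_return / aspiration))


def exploration_rate(recent_returns, aspiration, eps_min=0.01, eps_max=1.0):
    """Epsilon that falls linearly from eps_max to eps_min as the target is met."""
    ratio = achievement_ratio(recent_returns, aspiration)
    return eps_min + (eps_max - eps_min) * (1.0 - ratio)


if __name__ == "__main__":
    # A sliding window lets the rate rise again if returns later drop.
    window = deque(maxlen=3)
    for episode_return in [50.0, 150.0, 250.0, 20.0, 20.0, 20.0]:
        window.append(episode_return)
        eps = exploration_rate(window, aspiration=200.0)
        print(f"return={episode_return:6.1f}  epsilon={eps:.3f}")
```

Because the ratio is computed over a sliding window, a drop in returns (for example after the environment changes) raises the exploration rate again, which is one simple way to picture adaptation to a non-stationary environment.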
Related papers
- State-Novelty Guided Action Persistence in Deep Reinforcement Learning [7.05832012052375]
We propose a novel method to dynamically adjust the action persistence based on the current exploration status of the state space.
Our method can be seamlessly integrated into various basic exploration strategies to incorporate temporal persistence.
arXiv Detail & Related papers (2024-09-09T08:34:22Z)
- Random Latent Exploration for Deep Reinforcement Learning [71.88709402926415]
This paper introduces a new exploration technique called Random Latent Exploration (RLE).
RLE combines the strengths of bonus-based and noise-based exploration strategies, two popular approaches for effective exploration in deep RL.
We evaluate it on the challenging Atari and IsaacGym benchmarks and show that RLE exhibits higher overall scores across all the tasks than other approaches.
arXiv Detail & Related papers (2024-07-18T17:55:22Z)
- Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents [49.85633804913796]
We present an exploration-based trajectory optimization approach, referred to as ETO.
This learning method is designed to enhance the performance of open LLM agents.
Our experiments on three complex tasks demonstrate that ETO consistently surpasses baseline performance by a large margin.
arXiv Detail & Related papers (2024-03-04T21:50:29Z)
- Efficient Reinforcement Learning via Decoupling Exploration and Utilization [6.305976803910899]
Reinforcement Learning (RL) has achieved remarkable success across multiple fields and applications, including gaming, robotics, and autonomous vehicles.
In this work, we aim to train agents efficiently by decoupling exploration and utilization, so that the agent can escape the conundrum of suboptimal solutions.
The above idea is implemented in the proposed OPARL (Optimistic and Pessimistic Actor Reinforcement Learning) algorithm.
arXiv Detail & Related papers (2023-12-26T09:03:23Z)
- On the Importance of Exploration for Generalization in Reinforcement Learning [89.63074327328765]
We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty.
Our algorithm is the first value-based approach to achieve state-of-the-art on both Procgen and Crafter.
arXiv Detail & Related papers (2023-06-08T18:07:02Z)
- Exploration via Planning for Information about the Optimal Trajectory [67.33886176127578]
We develop a method that allows us to plan for exploration while taking the task and the current knowledge into account.
We demonstrate that our method learns strong policies with 2x fewer samples than strong exploration baselines.
arXiv Detail & Related papers (2022-10-06T20:28:55Z)
- Deep Intrinsically Motivated Exploration in Continuous Control [0.0]
In continuous systems, exploration is often performed through undirected strategies in which parameters of the networks or selected actions are perturbed by random noise.
We adapt existing theories on animal motivational systems into the reinforcement learning paradigm and introduce a novel directed exploration strategy.
Our framework extends to larger and more diverse state spaces, dramatically improves the baselines, and outperforms the undirected strategies significantly.
arXiv Detail & Related papers (2022-10-01T14:52:16Z)
- Learning to Explore by Reinforcement over High-Level Options [0.0]
We propose a new method that grants an agent two intertwined behavioral options: "look-around" and "frontier navigation".
In each timestep, an agent produces an option and a corresponding action according to the policy.
We demonstrate the effectiveness of the proposed method on two publicly available 3D environment datasets.
arXiv Detail & Related papers (2021-11-02T04:21:34Z)
- A Survey of Exploration Methods in Reinforcement Learning [64.01676570654234]
Reinforcement learning agents depend crucially on exploration to obtain informative data for the learning process.
In this article, we provide a survey of modern exploration methods in (sequential) reinforcement learning, as well as a taxonomy of exploration methods.
arXiv Detail & Related papers (2021-09-01T02:36:14Z)
- MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning [65.52675802289775]
We show that an uncertainty-aware classifier can solve challenging reinforcement learning problems.
We propose a novel method for computing the normalized maximum likelihood (NML) distribution.
We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.