Reward Bonuses with Gain Scheduling Inspired by Iterative Deepening Search
- URL: http://arxiv.org/abs/2212.10765v1
- Date: Wed, 21 Dec 2022 04:52:13 GMT
- Title: Reward Bonuses with Gain Scheduling Inspired by Iterative Deepening Search
- Authors: Taisuke Kobayashi
- Abstract summary: This paper introduces a novel method of adding intrinsic bonuses to a task-oriented reward function in order to efficiently facilitate exploration in reinforcement learning.
While various bonuses have been designed to date, they are analogous to the depth-first and breadth-first search algorithms in graph theory.
Gain scheduling is applied to the designed bonuses, inspired by the iterative deepening search, which is known to inherit the advantages of the two search algorithms.
- Score: 8.071506311915396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a novel method of adding intrinsic bonuses to a task-oriented reward function in order to efficiently facilitate exploration in reinforcement learning. While various bonuses have been designed to date, they are analogous to the depth-first and breadth-first search algorithms in graph theory. This paper therefore first designs two bonuses, one analogous to each of these search strategies. A heuristic gain scheduling is then applied to the designed bonuses, inspired by the iterative deepening search, which is known to inherit the advantages of both search algorithms. The proposed method is expected to allow the agent to efficiently reach the best solution in deeper states by gradually exploring unknown states. On three locomotion tasks with dense rewards and three simple tasks with sparse rewards, the two types of bonuses are shown to improve performance on different tasks complementarily. In addition, by combining them with the proposed gain scheduling, all tasks can be accomplished with high performance.
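As a rough illustration of the idea, the sketch below combines a task reward with a depth-first-style and a breadth-first-style intrinsic bonus under a simple gain schedule. The linear schedule, function names, and arguments are illustrative assumptions, not the paper's exact formulation.

```python
def shaped_reward(task_reward, bonus_depth, bonus_breadth, step, total_steps):
    """Minimal sketch: task reward plus two intrinsic bonuses whose gains
    are scheduled over training. The paper's actual bonuses and schedule
    may differ; this only shows the overall shape of the method."""
    # Iterative-deepening flavour: begin with the breadth-first-style bonus
    # dominating (wide, shallow exploration), then shift weight toward the
    # depth-first-style bonus so the agent pushes into deeper states.
    progress = min(step / total_steps, 1.0)
    gain_breadth = 1.0 - progress
    gain_depth = progress
    return task_reward + gain_depth * bonus_depth + gain_breadth * bonus_breadth

# Halfway through training, the two bonuses are weighted equally:
r = shaped_reward(task_reward=1.0, bonus_depth=0.2, bonus_breadth=0.5,
                  step=500, total_steps=1000)
```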
Related papers
- Sharing Knowledge in Multi-Task Deep Reinforcement Learning [57.38874587065694]
We study the benefit of sharing representations among tasks to enable the effective use of deep neural networks in Multi-Task Reinforcement Learning.
We prove this by providing theoretical guarantees that highlight the conditions under which it is convenient to share representations among tasks.
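A minimal sketch of what representation sharing can look like in practice (the shared-trunk architecture, layer sizes, and two-head setup below are assumptions for illustration, not the paper's construction):

```python
import torch.nn as nn

class SharedTrunkPolicy(nn.Module):
    """One shared encoder feeds per-task heads, so all tasks learn a common
    representation. Purely illustrative; layer sizes are arbitrary."""
    def __init__(self, obs_dim, action_dims):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
        )
        # One policy head per task; all heads reuse the shared features.
        self.heads = nn.ModuleList([nn.Linear(128, a) for a in action_dims])

    def forward(self, obs, task_id):
        # Shared features, task-specific output.
        return self.heads[task_id](self.encoder(obs))
```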
arXiv Detail & Related papers (2024-01-17T19:31:21Z)
- A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs [21.31346761487944]
We show that episodic bonuses are most effective when there is little shared structure across episodes.
We also find that combining the two bonuses can lead to more robust performance across different degrees of shared structure.
This results in an algorithm which sets a new state of the art across 16 tasks from the MiniHack suite used in prior work.
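One plausible way to combine the two is a global count bonus gated by episodic novelty; the multiplicative form and the count-based bonus below are assumptions for illustration, not necessarily the paper's exact rule:

```python
from collections import Counter

class CombinedBonus:
    """Global count bonus times an episodic first-visit indicator.
    States must be hashable; the combination rule is illustrative."""
    def __init__(self):
        self.global_counts = Counter()  # persists across episodes
        self.episode_seen = set()       # reset at every episode start

    def reset_episode(self):
        self.episode_seen.clear()

    def __call__(self, state):
        self.global_counts[state] += 1
        global_bonus = self.global_counts[state] ** -0.5  # 1/sqrt(N(s))
        episodic_bonus = float(state not in self.episode_seen)
        self.episode_seen.add(state)
        return global_bonus * episodic_bonus
```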
arXiv Detail & Related papers (2023-06-05T20:45:30Z)
- Extracting task trees using knowledge retrieval search algorithms in functional object-oriented network [0.0]
The functional object-oriented network (FOON) has been developed as a knowledge representation method that can be used by robots.
A FOON can be viewed as a graph from which robots can retrieve a task tree that provides an ordered plan.
arXiv Detail & Related papers (2022-11-15T17:20:08Z)
- On the Expressivity of Markov Reward [89.96685777114456]
This paper is dedicated to understanding the expressivity of reward as a way to capture tasks that we would want an agent to perform.
We frame this study around three new abstract notions of "task" that might be desirable: (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajectories.
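For intuition, notion (1) can be phrased as a simple check; the dictionary-of-returns encoding below is a hypothetical stand-in for the paper's formal objects:

```python
def reward_expresses_task(returns, acceptable):
    """A reward 'expresses' a set of acceptable behaviors if every acceptable
    behavior earns strictly higher expected return than every unacceptable
    one. `returns` maps behavior -> expected return under that reward."""
    good = [returns[b] for b in acceptable]
    bad = [r for b, r in returns.items() if b not in acceptable]
    return not good or not bad or min(good) > max(bad)
```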
arXiv Detail & Related papers (2021-11-01T12:12:16Z)
- On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game [140.19656665344917]
We study the reward-free RL problem, where an agent aims to thoroughly explore the environment without any pre-specified reward function.
We tackle this problem under the context of function approximation, leveraging powerful function approximators.
We establish the first provably efficient reward-free RL algorithm with kernel and neural function approximators.
arXiv Detail & Related papers (2021-10-19T07:26:33Z)
- Joint Learning On The Hierarchy Representation for Fine-Grained Human Action Recognition [13.088129408377918]
Fine-grained human action recognition is a core research topic in computer vision.
We propose a novel multi-task network which exploits the FineGym hierarchy representation to achieve effective joint learning and prediction.
Our results on the FineGym dataset achieve a new state-of-the-art performance, with 91.80% Top-1 accuracy and 88.46% mean accuracy for element actions.
arXiv Detail & Related papers (2021-10-12T09:37:51Z)
- Reciprocal Feature Learning via Explicit and Implicit Tasks in Scene Text Recognition [60.36540008537054]
In this work, we excavate the implicit task of character counting within traditional text recognition, without additional annotation cost.
We design a two-branch reciprocal feature learning framework in order to adequately utilize the features from both tasks.
Experiments on 7 benchmarks show the advantages of the proposed methods in both text recognition and the newly built character counting task.
arXiv Detail & Related papers (2021-05-13T12:27:35Z)
- Sparse Reward Exploration via Novelty Search and Emitters [55.41644538483948]
We introduce the SparsE Reward Exploration via Novelty and Emitters (SERENE) algorithm.
SERENE separates the search space exploration and reward exploitation into two alternating processes.
A meta-scheduler allocates a global computational budget by alternating between the two processes.
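A toy version of that alternation might look like the following; the chunked round-robin budget split and the step callables are illustrative assumptions, not SERENE's actual scheduler:

```python
def meta_schedule(explore_step, exploit_step, total_budget, chunk=1000):
    """Alternate between an exploration process and an exploitation process,
    spending a shared evaluation budget in fixed-size chunks."""
    spent, exploring = 0, True
    while spent < total_budget:
        step = explore_step if exploring else exploit_step
        for _ in range(min(chunk, total_budget - spent)):
            step()   # one environment evaluation by the active process
            spent += 1
        exploring = not exploring  # hand control to the other process
```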
arXiv Detail & Related papers (2021-02-05T12:34:54Z)
- Exploration in two-stage recommender systems [79.50534282841618]
Two-stage recommender systems are widely adopted in industry due to their scalability and maintainability.
A key challenge of this setup is that optimal performance of each stage in isolation does not imply optimal global performance.
We propose a method of synchronising the exploration strategies between the ranker and the nominators.
arXiv Detail & Related papers (2020-09-01T16:52:51Z)