A new Potential-Based Reward Shaping for Reinforcement Learning Agent
- URL: http://arxiv.org/abs/1902.06239v3
- Date: Mon, 13 Mar 2023 22:08:07 GMT
- Title: A new Potential-Based Reward Shaping for Reinforcement Learning Agent
- Authors: Babak Badnava, Mona Esmaeili, Nasser Mozayani, and Payman Zarkesh-Ha
- Abstract summary: The proposed method extracts knowledge from episodes' cumulative rewards.
The results indicate an improvement in the learning process in both the single-task and the multi-task reinforcement learner agents.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Potential-based reward shaping (PBRS) is a particular category of machine
learning methods which aims to improve the learning speed of a reinforcement
learning agent by extracting and utilizing extra knowledge while performing a
task. There are two steps in the process of transfer learning: extracting
knowledge from previously learned tasks and transferring that knowledge to use
it in a target task. The latter step is well discussed in the literature with
various methods being proposed for it, while the former has been explored less.
With this in mind, the type of knowledge that is transmitted is very important
and can lead to considerable improvement. Among the literature of both the
transfer learning and the potential-based reward shaping, a subject that has
never been addressed is the knowledge gathered during the learning process
itself. In this paper, we presented a novel potential-based reward shaping
method that attempted to extract knowledge from the learning process. The
proposed method extracts knowledge from episodes' cumulative rewards. The
proposed method has been evaluated in the Arcade learning environment and the
results indicate an improvement in the learning process in both the single-task
and the multi-task reinforcement learner agents.
Related papers
- Exploring CausalWorld: Enhancing robotic manipulation via knowledge transfer and curriculum learning [6.683222869973898]
This study explores a learning-based tri-finger robotic arm manipulating task, which requires complex movements and coordination among the fingers.
By employing reinforcement learning, we train an agent to acquire the necessary skills for proficient manipulation.
Two knowledge transfer strategies, fine-tuning and curriculum learning, were utilized within the soft actor-critic architecture.
arXiv Detail & Related papers (2024-03-25T23:19:19Z) - Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and associative mechanism of the brain.
It tackles the problem from two aspects: extracting knowledge and memorizing knowledge.
It is theoretically analyzed that the proposed learning paradigm can make the models of different tasks converge to the same optimum.
arXiv Detail & Related papers (2022-08-27T09:27:36Z) - Learning with Recoverable Forgetting [77.56338597012927]
Learning wIth Recoverable Forgetting explicitly handles the task- or sample-specific knowledge removal and recovery.
Specifically, LIRF brings in two innovative schemes, namely knowledge deposit and withdrawal.
We conduct experiments on several datasets, and demonstrate that the proposed LIRF strategy yields encouraging results with gratifying generalization capability.
arXiv Detail & Related papers (2022-07-17T16:42:31Z) - Transferability in Deep Learning: A Survey [80.67296873915176]
The ability to acquire and reuse knowledge is known as transferability in deep learning.
We present this survey to connect different isolated areas in deep learning with their relation to transferability.
We implement a benchmark and an open-source library, enabling a fair evaluation of deep learning methods in terms of transferability.
arXiv Detail & Related papers (2022-01-15T15:03:17Z) - Relational Experience Replay: Continual Learning by Adaptively Tuning
Task-wise Relationship [54.73817402934303]
We propose Experience Continual Replay (ERR), a bi-level learning framework to adaptively tune task-wise to achieve a better stability plasticity' tradeoff.
ERR can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z) - A Survey of Exploration Methods in Reinforcement Learning [64.01676570654234]
Reinforcement learning agents depend crucially on exploration to obtain informative data for the learning process.
In this article, we provide a survey of modern exploration methods in (Sequential) reinforcement learning, as well as a taxonomy of exploration methods.
arXiv Detail & Related papers (2021-09-01T02:36:14Z) - KnowRU: Knowledge Reusing via Knowledge Distillation in Multi-agent
Reinforcement Learning [16.167201058368303]
Deep Reinforcement Learning (RL) algorithms have achieved dramatically progress in the multi-agent area.
To alleviate this problem, efficient leveraging of the historical experience is essential.
We propose a method, named "KnowRU" for knowledge reusing.
arXiv Detail & Related papers (2021-03-27T12:38:01Z) - Transfer Learning in Deep Reinforcement Learning: A Survey [64.36174156782333]
Reinforcement learning is a learning paradigm for solving sequential decision-making problems.
Recent years have witnessed remarkable progress in reinforcement learning upon the fast development of deep neural networks.
transfer learning has arisen to tackle various challenges faced by reinforcement learning.
arXiv Detail & Related papers (2020-09-16T18:38:54Z) - Learning Transferable Concepts in Deep Reinforcement Learning [0.7161783472741748]
We show that learning discrete representations of sensory inputs can provide a high-level abstraction that is common across multiple tasks.
In particular, we show that it is possible to learn such representations by self-supervision, following an information theoretic approach.
Our method is able to learn concepts in locomotive and optimal control tasks that increase the sample efficiency in both known and unknown tasks.
arXiv Detail & Related papers (2020-05-16T04:45:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.