Related papers: KoGuN: Accelerating Deep Reinforcement Learning via Integrating Human Suboptimal Knowledge

KoGuN: Accelerating Deep Reinforcement Learning via Integrating Human Suboptimal Knowledge

URL: http://arxiv.org/abs/2002.07418v2
Date: Thu, 21 May 2020 07:02:41 GMT
Title: KoGuN: Accelerating Deep Reinforcement Learning via Integrating Human Suboptimal Knowledge
Authors: Peng Zhang, Jianye Hao, Weixun Wang, Hongyao Tang, Yi Ma, Yihai Duan, Yan Zheng
Abstract summary: We propose knowledge guided policy network (KoGuN), a novel framework that combines human prior suboptimal knowledge with reinforcement learning. Our framework consists of a fuzzy rule controller to represent human knowledge and a refine module to fine-tune suboptimal prior knowledge.
Score: 40.343858932413376
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reinforcement learning agents usually learn from scratch, which requires a large number of interactions with the environment. This is quite different from the learning process of human. When faced with a new task, human naturally have the common sense and use the prior knowledge to derive an initial policy and guide the learning process afterwards. Although the prior knowledge may be not fully applicable to the new task, the learning process is significantly sped up since the initial policy ensures a quick-start of learning and intermediate guidance allows to avoid unnecessary exploration. Taking this inspiration, we propose knowledge guided policy network (KoGuN), a novel framework that combines human prior suboptimal knowledge with reinforcement learning. Our framework consists of a fuzzy rule controller to represent human knowledge and a refine module to fine-tune suboptimal prior knowledge. The proposed framework is end-to-end and can be combined with existing policy-based reinforcement learning algorithm. We conduct experiments on both discrete and continuous control tasks. The empirical results show that our approach, which combines human suboptimal knowledge and RL, achieves significant improvement on learning efficiency of flat RL algorithms, even with very low-performance human prior knowledge.

Related papers

Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning [2.5352713493505785]
Reinforcement learning -- algorithms that teach artificial agents to interact with environments by maximising reward signals -- has achieved significant success in recent years. One promising research direction involves introducing goals to allow multimodal policies, commonly through hierarchical or curriculum reinforcement learning. We present a novel probabilistic curriculum learning algorithm to suggest goals for reinforcement learning agents in continuous control and navigation tasks.
arXiv Detail & Related papers (2025-04-02T08:15:16Z)
KnowPC: Knowledge-Driven Programmatic Reinforcement Learning for Zero-shot Coordination [11.203441390685201]
Zero-shot coordination (ZSC) remains a major challenge in the cooperative AI field. We introduce Knowledge-driven Programmatic reinforcement learning for ZSC. A significant challenge is the vast program search space, making it difficult to find high-performing programs efficiently.
arXiv Detail & Related papers (2024-08-08T09:43:54Z)
Hierarchically Structured Task-Agnostic Continual Learning [0.0]
We take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle. We propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths. Our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, as is the case with many existing continual learning algorithms.
arXiv Detail & Related papers (2022-11-14T19:53:15Z)
Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and associative mechanism of the brain. It tackles the problem from two aspects: extracting knowledge and memorizing knowledge. It is theoretically analyzed that the proposed learning paradigm can make the models of different tasks converge to the same optimum.
arXiv Detail & Related papers (2022-08-27T09:27:36Z)
Teachable Reinforcement Learning via Advice Distillation [161.43457947665073]
We propose a new supervision paradigm for interactive learning based on "teachable" decision-making systems that learn from structured advice provided by an external teacher. We show that agents that learn from advice can acquire new skills with significantly less human supervision than standard reinforcement learning algorithms.
arXiv Detail & Related papers (2022-03-19T03:22:57Z)
Rethinking Learning Dynamics in RL using Adversarial Networks [79.56118674435844]
We present a learning mechanism for reinforcement learning of closely related skills parameterized via a skill embedding space. The main contribution of our work is to formulate an adversarial training regime for reinforcement learning with the help of entropy-regularized policy gradient formulation.
arXiv Detail & Related papers (2022-01-27T19:51:09Z)
Transferability in Deep Learning: A Survey [80.67296873915176]
The ability to acquire and reuse knowledge is known as transferability in deep learning. We present this survey to connect different isolated areas in deep learning with their relation to transferability. We implement a benchmark and an open-source library, enabling a fair evaluation of deep learning methods in terms of transferability.
arXiv Detail & Related papers (2022-01-15T15:03:17Z)
The Information Geometry of Unsupervised Reinforcement Learning [133.20816939521941]
Unsupervised skill discovery is a class of algorithms that learn a set of policies without access to a reward function. We show that unsupervised skill discovery algorithms do not learn skills that are optimal for every possible reward function.
arXiv Detail & Related papers (2021-10-06T13:08:36Z)
Transferring Domain Knowledge with an Adviser in Continuous Tasks [0.0]
Reinforcement learning techniques are incapable of explicitly incorporating domain-specific knowledge into the learning process. We adapt the Deep Deterministic Policy Gradient (DDPG) algorithm to incorporate an adviser. Our experiments on OpenAi Gym benchmark tasks show that integrating domain knowledge through advisers expedites the learning and improves the policy towards better optima.
arXiv Detail & Related papers (2021-02-16T09:03:33Z)
Learning Transferable Concepts in Deep Reinforcement Learning [0.7161783472741748]
We show that learning discrete representations of sensory inputs can provide a high-level abstraction that is common across multiple tasks. In particular, we show that it is possible to learn such representations by self-supervision, following an information theoretic approach. Our method is able to learn concepts in locomotive and optimal control tasks that increase the sample efficiency in both known and unknown tasks.
arXiv Detail & Related papers (2020-05-16T04:45:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.