Augmented Q Imitation Learning (AQIL)
- URL: http://arxiv.org/abs/2004.00993v2
- Date: Sun, 5 Apr 2020 17:16:23 GMT
- Title: Augmented Q Imitation Learning (AQIL)
- Authors: Xiao Lei Zhang, Anish Agarwal
- Abstract summary: In imitation learning the machine learns by mimicking the behavior of an expert system whereas in reinforcement learning the machine learns via direct environment feedback.
This paper proposes Augmented Q-Imitation-Learning, a method by which deep reinforcement learning convergence can be accelerated.
- Score: 20.909770125018564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The study of unsupervised learning can be generally divided into two
categories: imitation learning and reinforcement learning. In imitation
learning the machine learns by mimicking the behavior of an expert system
whereas in reinforcement learning the machine learns via direct environment
feedback. Traditional deep reinforcement learning takes a significant time
before the machine starts to converge to an optimal policy. This paper proposes
Augmented Q-Imitation-Learning, a method by which deep reinforcement learning
convergence can be accelerated by applying Q-imitation-learning as the initial
training process in traditional Deep Q-learning.
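The core idea, a Q-imitation pre-training phase followed by standard Q-learning, can be sketched in a tabular setting. This is a minimal illustration under our own naming: the environment, function names, and hyperparameters are ours, and the paper itself applies the scheme to deep Q-networks rather than Q-tables.

```python
import numpy as np

def td_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One temporal-difference update toward the bootstrapped target."""
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])

class ChainEnv:
    """Toy 2-state chain: action 1 moves to state 1 and pays reward 1."""
    def reset(self):
        return 0
    def step(self, s, a):
        return (1, 1.0) if a == 1 else (0, 0.0)

def aqil_train(env, expert_action, n_imitation, n_rl, n_states, n_actions,
               alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Phase 1: Q-imitation-learning, updating Q along the expert's actions.
    Phase 2: standard epsilon-greedy Q-learning from environment feedback."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))

    # Phase 1: imitation -- always take (and update) the expert's action.
    s = env.reset()
    for _ in range(n_imitation):
        a = expert_action(s)
        s_next, r = env.step(s, a)
        td_update(Q, s, a, r, s_next, alpha, gamma)
        s = s_next

    # Phase 2: continue with ordinary epsilon-greedy Q-learning.
    for _ in range(n_rl):
        a = int(Q[s].argmax()) if rng.random() > epsilon else int(rng.integers(n_actions))
        s_next, r = env.step(s, a)
        td_update(Q, s, a, r, s_next, alpha, gamma)
        s = s_next
    return Q
```

The imitation phase seeds the Q-table with value estimates along expert trajectories, so the subsequent Q-learning phase starts from an informed policy rather than from scratch.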
Related papers
- How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training [92.88889953768455]
A critical gap remains in understanding how Large Language Models (LLMs) internalize new knowledge.
We identify computational subgraphs that facilitate knowledge storage and processing.
arXiv Detail & Related papers (2025-02-16T16:55:43Z)
- Online inductive learning from answer sets for efficient reinforcement learning exploration [52.03682298194168]
We exploit inductive learning of answer set programs to learn a set of logical rules representing an explainable approximation of the agent policy.
We then perform answer set reasoning on the learned rules to guide the exploration of the learning agent at the next batch.
Our methodology produces a significant boost in the discounted return achieved by the agent, even in the first batches of training.
arXiv Detail & Related papers (2025-01-13T16:13:22Z)
- Normalization and effective learning rates in reinforcement learning [52.59508428613934]
Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature.
We show that normalization brings with it a subtle but important side effect: an equivalence between growth in the norm of the network parameters and decay in the effective learning rate.
We propose to make the learning rate schedule explicit with a simple re-parameterization which we call Normalize-and-Project.
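The observation above, that with a normalization layer downstream, growth in the weight norm acts like a silent decay of the effective learning rate, suggests projecting weights back to a fixed norm after each update. The sketch below is illustrative only and uses our own naming; the paper's actual re-parameterization may differ in detail.

```python
import numpy as np

def project_to_norm(w, target_norm=1.0):
    """Rescale a weight vector back to a fixed norm after a gradient step.

    With a normalization layer after this weight, the network output is
    invariant to the rescaling, but future updates no longer see their
    effective learning rate shrink as the norm grows.
    """
    norm = np.linalg.norm(w)
    return w if norm == 0 else w * (target_norm / norm)

# One SGD step followed by the projection:
w = np.array([3.0, 4.0])           # ||w|| = 5
grad = np.array([0.0, 1.0])
w = project_to_norm(w - 0.1 * grad)
```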
arXiv Detail & Related papers (2024-07-01T20:58:01Z)
- Towards Automated Knowledge Integration From Human-Interpretable Representations [55.2480439325792]
We introduce and motivate theoretically the principles of informed meta-learning enabling automated and controllable inductive bias selection.
We empirically demonstrate the potential benefits and limitations of informed meta-learning in improving data efficiency and generalisation.
arXiv Detail & Related papers (2024-02-25T15:08:37Z)
- FRAC-Q-Learning: A Reinforcement Learning with Boredom Avoidance Processes for Social Robots [0.0]
We propose a new reinforcement learning method specialized for the social robot, the FRAC-Q-learning, that can avoid user boredom.
The proposed algorithm consists of a forgetting process in addition to randomizing and categorizing processes.
FRAC-Q-learning showed a significantly higher trend in interest score and made it significantly harder for users to become bored compared to traditional Q-learning.
arXiv Detail & Related papers (2023-11-26T15:11:17Z)
- Active Reinforcement Learning -- A Roadmap Towards Curious Classifier Systems for Self-Adaptation [0.456877715768796]
The article aims to set up a research agenda towards what we call "active reinforcement learning" in intelligent systems.
Traditional approaches separate the learning problem and make isolated use of techniques from different fields of machine learning.
arXiv Detail & Related papers (2022-01-11T13:50:26Z)
- The Role of Bio-Inspired Modularity in General Learning [0.0]
One goal of general intelligence is to learn novel information without overwriting prior learning.
Bootstrapping previous knowledge may allow for faster learning of a novel task.
Modularity may offer a solution to weight-update learning methods that adheres to the learning-without-catastrophic-forgetting and bootstrapping constraints.
arXiv Detail & Related papers (2021-09-23T18:45:34Z)
- Transfer Learning in Deep Reinforcement Learning: A Survey [64.36174156782333]
Reinforcement learning is a learning paradigm for solving sequential decision-making problems.
Recent years have witnessed remarkable progress in reinforcement learning upon the fast development of deep neural networks.
Transfer learning has arisen to tackle various challenges faced by reinforcement learning.
arXiv Detail & Related papers (2020-09-16T18:38:54Z)
- Bridging the Imitation Gap by Adaptive Insubordination [88.35564081175642]
We show that when the teaching agent makes decisions with access to privileged information, this information is marginalized during imitation learning.
We propose 'Adaptive Insubordination' (ADVISOR) to address this gap.
ADVISOR dynamically weights imitation and reward-based reinforcement learning losses during training, enabling on-the-fly switching between imitation and exploration.
arXiv Detail & Related papers (2020-07-23T17:59:57Z)
- A Novel Update Mechanism for Q-Networks Based On Extreme Learning Machines [0.6445605125467573]
The Extreme Q-Learning Machine (EQLM) is applied to a reinforcement learning problem in the same manner as gradient-based updates.
We compare its performance to a typical Q-Network on the cart-pole task.
We show EQLM has similar long-term learning performance to a Q-Network.
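The extreme-learning-machine update behind EQLM replaces gradient descent on the output layer with a closed-form least-squares solve over a fixed random hidden layer. The sketch below is a generic ELM fit under our own naming; the paper's exact incremental variant for Q-networks may differ.

```python
import numpy as np

def elm_output_weights(X, targets, W_hidden, b_hidden, reg=1e-3):
    """Closed-form output weights for a fixed random hidden layer.

    Solves the regularized least-squares problem
    (H^T H + reg*I) beta = H^T targets, where H are hidden activations.
    """
    H = np.tanh(X @ W_hidden + b_hidden)
    A = H.T @ H + reg * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ targets)
```

In a Q-network setting, `targets` would be the bootstrapped Q-targets, so the output layer is re-fit in one solve instead of many gradient steps.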
arXiv Detail & Related papers (2020-06-04T16:16:13Z)
- A new Potential-Based Reward Shaping for Reinforcement Learning Agent [0.0]
The proposed method extracts knowledge from episodes' cumulative rewards.
The results indicate an improvement in the learning process in both the single-task and the multi-task reinforcement learner agents.
arXiv Detail & Related papers (2019-02-17T10:34:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.