Towards sample-efficient episodic control with DAC-ML
- URL: http://arxiv.org/abs/2012.13779v1
- Date: Sat, 26 Dec 2020 16:38:08 GMT
- Title: Towards sample-efficient episodic control with DAC-ML
- Authors: Ismael T. Freire, Adrián F. Amil, Vasiliki Vouloutsi, Paul F.M.J. Verschure
- Abstract summary: The sample-inefficiency problem in Artificial Intelligence refers to the inability of current Deep Reinforcement Learning models to optimize action policies within a small number of episodes.
Recent studies have tried to overcome this limitation by adding memory systems and architectural biases to improve learning speed.
In this paper, we capitalize on the design principles of the Distributed Adaptive Control (DAC) theory of mind and brain to build a novel cognitive architecture.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The sample-inefficiency problem in Artificial Intelligence refers to the
inability of current Deep Reinforcement Learning models to optimize action
policies within a small number of episodes. Recent studies have tried to
overcome this limitation by adding memory systems and architectural biases to
improve learning speed, such as in Episodic Reinforcement Learning. However,
despite achieving incremental improvements, their performance is still not
comparable to how humans learn behavioral policies. In this paper, we
capitalize on the design principles of the Distributed Adaptive Control (DAC)
theory of mind and brain to build a novel cognitive architecture (DAC-ML) that,
by incorporating a hippocampus-inspired sequential memory system, can rapidly
converge to effective action policies that maximize reward acquisition in a
challenging foraging task.
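The abstract stays at the architectural level, but the core idea, a hippocampus-inspired memory that stores whole behavioral sequences and reuses them to act, lends itself to a compact sketch. Below is a minimal, illustrative Python version: store complete (observation, action) sequences tagged with their episode reward, and act by recalling the actions whose stored contexts best match the current cue. The class name, matching rule, and voting scheme are assumptions for illustration, not DAC-ML's actual mechanism.

```python
# Minimal sketch of a sequential episodic memory in the spirit of
# DAC-ML's hippocampus-inspired memory system. Illustrative only:
# the matching and voting rules below are assumptions.
import numpy as np

class SequentialEpisodicMemory:
    def __init__(self, capacity=500):
        self.episodes = []            # each entry: (trajectory, total_reward)
        self.capacity = capacity

    def store(self, trajectory, total_reward):
        """trajectory: list of (obs, action) pairs; keep whole sequences."""
        self.episodes.append((trajectory, total_reward))
        if len(self.episodes) > self.capacity:
            self.episodes.pop(0)      # forget the oldest episode

    def recall(self, obs, k=5):
        """Return the action whose stored contexts best match `obs`,
        weighted by the reward of the episode each match came from."""
        scored = []
        for trajectory, reward in self.episodes:
            for past_obs, action in trajectory:
                similarity = -np.linalg.norm(obs - past_obs)
                scored.append((similarity, reward, action))
        if not scored:
            return None               # caller falls back to exploration
        scored.sort(key=lambda t: t[0], reverse=True)
        votes = {}
        for _, reward, action in scored[:k]:   # actions must be hashable
            votes[action] = votes.get(action, 0.0) + reward
        return max(votes, key=votes.get)
```

The sequence-level store is what buys sample efficiency: a single rewarded episode immediately biases action selection in similar contexts, without waiting for slow gradient updates.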
Related papers
- CODE-CL: COnceptor-Based Gradient Projection for DEep Continual Learning [7.573297026523597]
We introduce COnceptor-based gradient projection for DEep Continual Learning (CODE-CL).
CODE-CL encodes directional importance within the input space of past tasks, allowing new knowledge integration in directions modulated by $1-S$.
We analyze task overlap using conceptor-based representations to identify highly correlated tasks.
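The $1-S$ modulation in the summary suggests a projection built from a conceptor-like matrix. As a hedged sketch, the snippet below uses the standard conceptor formula $C = R(R + \alpha^{-2}I)^{-1}$ from the conceptor literature and pushes new gradients through $I - C$; whether CODE-CL computes its matrix $S$ this way is an assumption, not something stated in the summary.

```python
# Hedged sketch of conceptor-style gradient projection for continual
# learning. The conceptor formula is standard; equating it with
# CODE-CL's matrix S is an assumption.
import numpy as np

def conceptor(X, aperture=10.0):
    """C = R (R + aperture^-2 I)^-1, with R the input correlation matrix.
    X has shape (features, samples)."""
    R = X @ X.T / X.shape[1]
    d = X.shape[0]
    return R @ np.linalg.inv(R + (aperture ** -2) * np.eye(d))

def project_gradient(grad, C):
    """Modulate the gradient by (I - C): steer new learning into
    directions the past tasks used weakly."""
    return (np.eye(C.shape[0]) - C) @ grad

# Usage: build a conceptor from a finished task, then shield it.
X_old = np.random.randn(8, 100)        # activations from a past task
C_old = conceptor(X_old)
g = np.random.randn(8)                 # raw gradient on the new task
g_safe = project_gradient(g, C_old)    # update mostly orthogonal to the past
```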
arXiv Detail & Related papers (2024-11-21T22:31:06Z)
- Incorporating Neuro-Inspired Adaptability for Continual Learning in Artificial Intelligence [59.11038175596807]
Continual learning aims to empower artificial intelligence with strong adaptability to the real world.
Existing advances mainly focus on preserving memory stability to overcome catastrophic forgetting.
We propose a generic approach that appropriately attenuates old memories in parameter distributions to improve learning plasticity.
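One concrete reading of "attenuates old memories in parameter distributions" is to decay the importance weights of a quadratic, EWC-style penalty so that older tasks constrain new learning less. The sketch below is that generic construction; the decay rule and penalty form are illustrative assumptions, not the paper's exact mechanism.

```python
# Illustrative sketch: loosen the anchor on old tasks by decaying the
# per-task importance of a quadratic penalty. The decay rule is an
# assumption, not the paper's mechanism.
import numpy as np

def penalized_loss(task_loss, params, anchors, importances, lam=1.0, decay=0.9):
    """task_loss: scalar loss on the new task.
    anchors: parameter snapshots saved after each old task (newest last).
    importances: matching per-parameter importances (e.g. Fisher diagonal)."""
    penalty = 0.0
    for age, (theta_old, F) in enumerate(
            zip(reversed(anchors), reversed(importances))):
        attenuation = decay ** age     # older memories constrain less
        penalty += attenuation * np.sum(F * (params - theta_old) ** 2)
    return task_loss + lam * penalty
```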
arXiv Detail & Related papers (2023-08-29T02:43:58Z)
- Improving Performance in Continual Learning Tasks using Bio-Inspired Architectures [4.2903672492917755]
We develop a biologically inspired lightweight neural network architecture that incorporates synaptic plasticity mechanisms and neuromodulation.
Our approach leads to superior online continual learning performance on Split-MNIST, Split-CIFAR-10, and Split-CIFAR-100 datasets.
We further demonstrate the effectiveness of our approach by integrating key design concepts into other backpropagation-based continual learning algorithms.
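The two named ingredients, synaptic plasticity and neuromodulation, can be sketched as a Hebbian weight trace gated by a scalar modulatory signal. The shapes, gating rule, and constants below are illustrative assumptions rather than the paper's architecture.

```python
# Sketch of a neuromodulated Hebbian update: coincident activity is
# written into the weights only to the degree the modulator allows.
import numpy as np

def neuromodulated_hebbian_step(W, pre, post, modulator, eta=0.01, decay=0.99):
    """W: weight matrix (post x pre); pre/post: activity vectors;
    modulator: scalar in [0, 1], e.g. reward salience or surprise."""
    hebb = np.outer(post, pre)             # coincident activity
    return decay * W + eta * modulator * hebb
```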
arXiv Detail & Related papers (2023-08-08T19:12:52Z)
- Assessor-Guided Learning for Continual Environments [17.181933166255448]
This paper proposes an assessor-guided learning strategy for continual learning.
An assessor guides the learning process of a base learner by controlling the direction and pace of the learning process.
The assessor is trained in a meta-learning manner with a meta-objective to boost the learning process of the base learner.
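As a hedged illustration of how an assessor can set the pace and direction of learning, the sketch below weights each sample's loss with a softmax "assessor"; in the paper the assessor is itself trained by meta-learning, which is omitted here, and the weighting form is an assumption.

```python
# Stand-in assessor: reweight per-sample losses so the update focuses
# where the (meta-learned, here hand-fixed) assessor directs it.
import numpy as np

def assessor_weights(per_sample_losses, temperature=1.0):
    """Softmax over losses: low temperature emphasises hard samples."""
    z = np.asarray(per_sample_losses) / temperature
    w = np.exp(z - z.max())
    return w / w.sum()

def guided_loss(per_sample_losses, temperature=1.0):
    w = assessor_weights(per_sample_losses, temperature)
    return float(np.sum(w * np.asarray(per_sample_losses)))
```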
arXiv Detail & Related papers (2023-03-21T06:45:14Z)
- Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC).
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
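The mechanism the summary describes, differentiating returns through the simulator over a short horizon and letting a smooth critic summarise the tail, can be sketched in a few lines of PyTorch. The toy dynamics, horizon length, and stand-in critic below are assumptions for illustration, not the SHAC implementation.

```python
# Sketch: backpropagate reward through differentiable dynamics for a
# short horizon, bootstrapping the remainder with a smooth critic.
import torch

def short_horizon_objective(policy, critic, state, horizon=8, gamma=0.99):
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        action = policy(state)
        state = state + 0.1 * action          # toy differentiable dynamics
        reward = -(state ** 2).sum()          # reward: stay near the origin
        total = total + discount * reward
        discount *= gamma
    return -(total + discount * critic(state))   # negate: minimise a loss

policy = torch.nn.Linear(4, 4)
critic = lambda s: -(s ** 2).sum()            # smooth stand-in critic
state = torch.randn(4)
loss = short_horizon_objective(policy, critic, state)
loss.backward()                               # gradients flow through the dynamics
```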
arXiv Detail & Related papers (2022-04-14T17:46:26Z)
- Imitating, Fast and Slow: Robust learning from demonstrations via decision-time planning [96.72185761508668]
IMitation PLANning at Test-time (IMPLANT) is a new meta-algorithm for imitation learning.
We demonstrate that IMPLANT significantly outperforms benchmark imitation learning approaches on standard control environments.
arXiv Detail & Related papers (2022-04-07T17:16:52Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- Enhancing reinforcement learning by a finite reward response filter with a case study in intelligent structural control [0.0]
In many reinforcement learning (RL) problems, it takes some time before an action taken by the agent reaches its maximum effect on the environment.
This paper introduces an enhanced Q-learning method in which, at the beginning of the learning phase, the agent takes a single action.
We have applied the developed method to a structural control problem in which the goal of the agent is to reduce the vibrations of a building subjected to earthquake excitations, where actions take effect with a specified delay.
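A minimal way to realise a finite reward response filter is to redistribute each observed reward backward over the steps that plausibly caused it, using a known response profile, before any Q-update. The filter taps below are illustrative assumptions.

```python
# Sketch: credit delayed rewards back to the actions that caused them
# via a finite impulse response. The taps are an assumed profile.
import numpy as np

def filtered_returns(rewards, response=(0.2, 0.5, 0.3)):
    """response[k] = share of a reward attributed to the action k steps back."""
    T = len(rewards)
    credit = np.zeros(T)
    for t, r in enumerate(rewards):
        for k, w in enumerate(response):
            if t - k >= 0:
                credit[t - k] += w * r   # reward at t, partly caused at t-k
    return credit

# A reward landing two steps after its action is pushed back onto it:
print(filtered_returns([0.0, 0.0, 1.0]))   # -> [0.3 0.5 0.2]
```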
arXiv Detail & Related papers (2020-10-25T19:28:35Z)
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analysis on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm.
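A hedged sketch of the sequence-generation idea: take momentum-accumulating gradient-ascent steps on the model's loss and treat every iterate as an adversarial sample, in the spirit of Hamiltonian Monte Carlo exploration of the loss surface. The step sizes, friction term, and omission of a Metropolis correction are simplifying assumptions.

```python
# Sketch: generate a sequence of adversarial examples with accumulated
# momentum; simplified relative to HMCAM (no acceptance step).
import torch

def adversarial_sequence(model, x, y, steps=10, step_size=0.01, friction=0.9):
    loss_fn = torch.nn.CrossEntropyLoss()
    x_adv = x.clone().detach()
    momentum = torch.zeros_like(x_adv)
    samples = []
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        momentum = friction * momentum + grad      # accumulate ascent direction
        x_adv = (x_adv + step_size * momentum).detach()
        samples.append(x_adv.clone())              # every iterate is a sample
    return samples

model = torch.nn.Linear(10, 3)                     # hypothetical classifier
x, y = torch.randn(2, 10), torch.tensor([0, 2])
sequence = adversarial_sequence(model, x, y)
```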
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
- Deep RL With Information Constrained Policies: Generalization in Continuous Control [21.46148507577606]
We explore the learning advantages that a natural constraint on information flow might confer onto artificial agents in continuous control tasks.
We implement a novel Capacity-Limited Actor-Critic (CLAC) algorithm.
Our experiments show that compared to alternative approaches, CLAC offers improvements in generalization between training and modified test environments.
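The capacity limit can be sketched as a mutual-information penalty between states and actions, approximated by the mean KL divergence between each state's policy and the average action distribution. The exact CLAC objective may differ; this is the generic information-cost construction.

```python
# Sketch: penalise I(S; A) so the policy carries no more information
# about the state than it needs. Generic construction, not CLAC's exact loss.
import numpy as np

def capacity_penalty(policy_probs, eps=1e-12):
    """policy_probs: (num_states, num_actions) rows of pi(a|s)."""
    marginal = policy_probs.mean(axis=0)             # average action usage
    kl = np.sum(policy_probs * np.log((policy_probs + eps) / (marginal + eps)),
                axis=1)
    return kl.mean()                                 # ~ I(S; A)

def capacity_limited_actor_loss(advantages, log_probs, policy_probs, beta=0.1):
    """Standard policy-gradient term plus the information cost."""
    return -np.mean(advantages * log_probs) + beta * capacity_penalty(policy_probs)
```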
arXiv Detail & Related papers (2020-10-09T15:42:21Z)
- Discrete Action On-Policy Learning with Action-Value Critic [72.20609919995086]
Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension.
We construct a critic to estimate action-value functions, apply it to correlated actions, and combine these critic-estimated action values to control the variance of gradient estimation.
These efforts result in a new discrete action on-policy RL algorithm that empirically outperforms related on-policy algorithms relying on variance control techniques.
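Because the action set is discrete, critic estimates can be combined over all actions so the gradient is an exact expectation under the current policy, which is one standard way to control variance. The sketch below shows that estimator; the paper's handling of correlated, multi-dimensional actions is not reproduced here.

```python
# Sketch: all-action policy gradient with the critic's V(s) as a
# control variate; no sampling over actions, hence reduced variance.
import numpy as np

def expected_policy_gradient(pi, q_values, dpi):
    """pi: pi(a|s) over A actions; q_values: critic estimates Q(s,a);
    dpi: Jacobian d pi(a|s) / d theta, shape (A, num_params)."""
    baseline = np.dot(pi, q_values)          # V(s), a control variate
    return dpi.T @ (q_values - baseline)     # sum over all actions

pi = np.array([0.2, 0.5, 0.3])
q = np.array([1.0, 0.0, 2.0])
dpi = np.random.randn(3, 6)                  # hypothetical Jacobian, 6 params
g = expected_policy_gradient(pi, q, dpi)
```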
arXiv Detail & Related papers (2020-02-10T04:23:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.