Entropy Regularized Reinforcement Learning with Cascading Networks
- URL: http://arxiv.org/abs/2210.08503v1
- Date: Sun, 16 Oct 2022 10:28:59 GMT
- Title: Entropy Regularized Reinforcement Learning with Cascading Networks
- Authors: Riccardo Della Vecchia, Alena Shilova, Philippe Preux, Riad Akrour
- Abstract summary: Deep RL uses neural networks as function approximators.
One of the major difficulties of RL is the absence of i.i.d. data.
In this work, we challenge the common practice of the (un)supervised learning community of using a fixed neural architecture.
- Score: 9.973226671536041
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Reinforcement Learning (Deep RL) has had incredible achievements on high
dimensional problems, yet its learning process remains unstable even on the
simplest tasks. Deep RL uses neural networks as function approximators. These
neural models are largely inspired by developments in the (un)supervised
machine learning community. Compared to these learning frameworks, one of the
major difficulties of RL is the absence of i.i.d. data. One way to cope with
this difficulty is to control the rate of change of the policy at every
iteration. In this work, we challenge the common practices of the
(un)supervised learning community of using a fixed neural architecture, by
having a neural model that grows in size at each policy update. This allows a
closed form entropy regularized policy update, which leads to a better control
of the rate of change of the policy at each iteration and help cope with the
non i.i.d. nature of RL. Initial experiments on classical RL benchmarks show
promising results with remarkable convergence on some RL tasks when compared to
other deep RL baselines, while exhibiting limitations on others.
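The paper's cascading-network construction is not reproduced here, but the closed-form entropy-regularized (KL-penalized) policy update the abstract refers to can be sketched for a discrete action space. This is a minimal illustration of the standard result, not the authors' implementation; the temperature parameter `eta` is an assumed name.

```python
import numpy as np

def entropy_regularized_update(pi_old, q_values, eta=1.0):
    """Closed-form entropy/KL-regularized policy update for a discrete
    action space: pi_new(a) proportional to pi_old(a) * exp(Q(a) / eta).

    A larger eta keeps pi_new closer to pi_old, limiting the rate of
    change of the policy per iteration; a smaller eta makes the update
    greedier with respect to Q.
    """
    logits = np.log(pi_old) + q_values / eta
    logits -= logits.max()            # shift for numerical stability
    pi_new = np.exp(logits)
    return pi_new / pi_new.sum()      # normalize to a distribution

# Toy demo: uniform prior over 3 actions, one action with higher Q.
pi_old = np.array([1 / 3, 1 / 3, 1 / 3])
q = np.array([1.0, 0.0, 0.0])
print(entropy_regularized_update(pi_old, q, eta=1.0))
```

Because the update is multiplicative in the old policy, the step size of each policy change is explicitly controlled by `eta`, which is the kind of controlled update the abstract credits for coping with non-i.i.d. data.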
Related papers
- A Neuromorphic Architecture for Reinforcement Learning from Real-Valued
Observations [0.34410212782758043]
Reinforcement Learning (RL) provides a powerful framework for decision-making in complex environments.
This paper presents a novel Spiking Neural Network (SNN) architecture for solving RL problems with real-valued observations.
arXiv Detail & Related papers (2023-07-06T12:33:34Z)
- The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions [14.778024171498208]
Reinforcement learning algorithms have proven transformative in a range of domains.
Much theory of RL has focused on discrete state spaces or worst-case analysis.
We propose a solvable high-dimensional model of RL that can capture a variety of learning protocols.
arXiv Detail & Related papers (2023-06-17T18:16:51Z)
- Beyond Tabula Rasa: Reincarnating Reinforcement Learning [37.201451908129386]
Learning tabula rasa, that is, without any prior knowledge, is the prevalent workflow in reinforcement learning (RL) research.
We present reincarnating RL as an alternative workflow, where prior computational work is reused or transferred between design iterations of an RL agent.
We find that existing approaches fail in this setting and propose a simple algorithm to address their limitations.
arXiv Detail & Related papers (2022-06-03T15:11:10Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- Deep Reinforcement Learning with Spiking Q-learning [51.386945803485084]
Spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with less energy consumption.
Combining SNNs with deep reinforcement learning (RL) provides a promising energy-efficient approach to realistic control tasks.
arXiv Detail & Related papers (2022-01-21T16:42:11Z)
- Single-Shot Pruning for Offline Reinforcement Learning [47.886329599997474]
Deep Reinforcement Learning (RL) is a powerful framework for solving complex real-world problems.
One way to make such networks more compact is to prune them, leaving only the necessary parameters.
We close the gap between RL and single-shot pruning techniques and present a general pruning approach to the Offline RL.
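The pruning summary above can be illustrated with generic one-shot magnitude pruning, the simplest single-shot criterion: zero out the weights with the smallest absolute values. This is a hedged sketch of the general technique, not the paper's specific method; the function name and `sparsity` parameter are assumptions.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """One-shot magnitude pruning: zero out the fraction `sparsity`
    of entries with the smallest absolute value.

    Illustrative only; ties at the threshold may prune slightly
    more than the requested fraction.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

w = np.arange(1.0, 11.0)
print(magnitude_prune(w, sparsity=0.5))  # smallest half of the weights zeroed
```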
arXiv Detail & Related papers (2021-12-31T18:10:02Z)
- Catastrophic Interference in Reinforcement Learning: A Solution Based on Context Division and Knowledge Distillation [8.044847478961882]
We introduce the concept of "context" into single-task reinforcement learning.
We develop a novel scheme, termed Context Division and Knowledge Distillation driven RL (CDaKD).
Our results show that, with various replay memory capacities, CDaKD can consistently improve the performance of existing RL algorithms.
arXiv Detail & Related papers (2021-09-01T12:02:04Z)
- Text Generation with Efficient (Soft) Q-Learning [91.47743595382758]
Reinforcement learning (RL) offers a more flexible solution than supervised training by allowing users to plug in arbitrary task metrics as rewards.
We introduce a new RL formulation for text generation from the soft Q-learning perspective.
We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
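The soft Q-learning perspective mentioned above rests on two standard identities: the soft (log-sum-exp) state value and the Boltzmann policy it induces. A minimal numeric sketch of those identities, for a discrete action space (an illustration of the general framework, not the paper's text-generation formulation):

```python
import numpy as np

def soft_value(q_values, eta=1.0):
    """Soft state value used in soft Q-learning:
    V(s) = eta * log sum_a exp(Q(s, a) / eta).
    As eta -> 0 this approaches max_a Q(s, a)."""
    z = q_values / eta
    m = z.max()                       # shift for numerical stability
    return eta * (m + np.log(np.exp(z - m).sum()))

def soft_policy(q_values, eta=1.0):
    """Boltzmann policy induced by the soft value:
    pi(a|s) = exp((Q(s, a) - V(s)) / eta)."""
    return np.exp((q_values - soft_value(q_values, eta)) / eta)

q = np.array([2.0, 1.0, 0.0])
print(soft_value(q, eta=0.1))   # close to max Q = 2.0
print(soft_policy(q, eta=1.0))  # a distribution favoring higher-Q actions
```

Higher `eta` spreads probability mass across actions (more entropy), which is why entropy regularization of this kind keeps exploration alive during training.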
arXiv Detail & Related papers (2021-06-14T18:48:40Z)
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
- Transient Non-Stationarity and Generalisation in Deep Reinforcement Learning [67.34810824996887]
Non-stationarity can arise in Reinforcement Learning (RL) even in stationary environments.
We propose Iterated Relearning (ITER) to improve generalisation of deep RL agents.
arXiv Detail & Related papers (2020-06-10T13:26:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.