Decorrelated Soft Actor-Critic for Efficient Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2501.19133v1
- Date: Fri, 31 Jan 2025 13:38:57 GMT
- Title: Decorrelated Soft Actor-Critic for Efficient Deep Reinforcement Learning
- Authors: Burcu Küçükoğlu, Sander Dalm, Marcel van Gerven
- Abstract summary: We propose a novel approach to online decorrelation in deep RL based on the decorrelated backpropagation algorithm.
Experiments on the Atari 100k benchmark show that, compared to the regular SAC baseline, DSAC trains faster in five of the seven games tested.
- Score: 1.2597747768235847
- License:
- Abstract: The effectiveness of credit assignment in reinforcement learning (RL) when dealing with high-dimensional data is influenced by the success of representation learning via deep neural networks, and has implications for the sample efficiency of deep RL algorithms. Input decorrelation has previously been introduced as a method to speed up optimization in neural networks, and has proven impactful both for efficient deep learning and as a method for effective representation learning in deep RL algorithms. We propose a novel approach to online decorrelation in deep RL, based on the decorrelated backpropagation algorithm, that seamlessly integrates the decorrelation process into the RL training pipeline. A decorrelation matrix is added to each layer and updated by a separate decorrelation learning rule that minimizes the total decorrelation loss across all layers, in parallel with minimizing the usual RL loss. We use this approach in combination with the soft actor-critic (SAC) method, which we refer to as decorrelated soft actor-critic (DSAC). Experiments on the Atari 100k benchmark show that, compared to the regular SAC baseline, DSAC trains faster in five of the seven games tested and improves reward performance in two games with around a 50% reduction in wall-clock time, while maintaining performance levels on the other games. These results demonstrate the positive impact of network-wide decorrelation in deep RL, improving sample efficiency through more effective credit assignment.
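As a rough illustration of the mechanism the abstract describes, the sketch below adds a decorrelation matrix R to a single layer and updates it with a local rule that shrinks the off-diagonal covariance of the decorrelated activations, alongside the usual RL loss. The class name, the exact update rule, and the learning rate are assumptions made for illustration, not the authors' released implementation.

```python
# Minimal sketch of per-layer online decorrelation (assumed form, PyTorch).
import torch
import torch.nn as nn

class DecorrelatedLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # R starts at the identity, so training begins as plain backprop.
        self.R = nn.Parameter(torch.eye(in_features), requires_grad=False)
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Decorrelate the input before the ordinary affine transform.
        self.last_z = x @ self.R.T
        return self.linear(self.last_z)

    @torch.no_grad()
    def decorrelation_update(self, lr: float = 1e-3) -> torch.Tensor:
        # Local rule: push the batch covariance of the decorrelated
        # activations toward a diagonal matrix (off-diagonals -> 0).
        z = self.last_z
        cov = (z.T @ z) / z.shape[0]
        off_diag = cov - torch.diag(torch.diag(cov))
        self.R -= lr * (off_diag @ self.R)
        # Per-layer decorrelation loss: sum of squared off-diagonal
        # covariances; summing over layers gives the total decorrelation
        # loss the abstract refers to.
        return (off_diag ** 2).sum()

# Usage: after each forward pass, call decorrelation_update() on every
# layer, in parallel with the optimizer step for the RL loss.
layer = DecorrelatedLinear(8, 4)
out = layer(torch.randn(32, 8))
decor_loss = layer.decorrelation_update()
```

In DSAC this update would run for every layer of the actor and critic networks alongside the SAC gradient step; the gradient-descent form above is one plausible reading of the "separate decorrelation learning rule", not a verbatim reproduction of it.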
Related papers
- Adaptive Data Exploitation in Deep Reinforcement Learning [50.53705050673944]
We introduce ADEPT, a powerful framework that enhances **data efficiency** and **generalization** in deep reinforcement learning (RL).
Specifically, ADEPT adaptively manages the use of sampled data across different learning stages via multi-armed bandit (MAB) algorithms.
We test ADEPT on benchmarks including Procgen, MiniGrid, and PyBullet.
arXiv Detail & Related papers (2025-01-22T04:01:17Z)
- Broad Critic Deep Actor Reinforcement Learning for Continuous Control [5.440090782797941]
A novel hybrid architecture for actor-critic reinforcement learning (RL) algorithms is introduced.
The proposed architecture integrates the broad learning system (BLS) with deep neural networks (DNNs).
The effectiveness of the proposed algorithm is evaluated by applying it to two classic continuous control tasks.
arXiv Detail & Related papers (2024-11-24T12:24:46Z)
- Efficient Deep Learning with Decorrelated Backpropagation [1.9731499060686393]
We show for the first time that much more efficient training of very deep neural networks using decorrelated backpropagation is feasible.
We obtain a more than two-fold speed-up and higher test accuracy compared to backpropagation when training an 18-layer deep residual network.
arXiv Detail & Related papers (2024-05-03T17:21:13Z)
- PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation [61.57833648734164]
We propose a novel Parallel Yielding Re-Activation (PYRA) method for training-inference efficient task adaptation.
PYRA outperforms all competing methods at both low and high compression rates.
arXiv Detail & Related papers (2024-03-14T09:06:49Z)
- Deep Learning Meets Adaptive Filtering: A Stein's Unbiased Risk Estimator Approach [13.887632153924512]
We introduce task-based deep learning frameworks, denoted as Deep RLS and Deep EASI.
These architectures transform the iterations of the original algorithms into layers of a deep neural network, enabling efficient source signal estimation.
To further enhance performance, we propose training these deep unrolled networks using a surrogate loss function based on Stein's unbiased risk estimator (SURE).
arXiv Detail & Related papers (2023-07-31T14:26:41Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are better suited to active learning than the metric used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning [92.18524491615548]
Contrastive self-supervised learning has been successfully integrated into the practice of (deep) reinforcement learning (RL).
We study how RL can be empowered by contrastive learning in a class of Markov decision processes (MDPs) and Markov games (MGs) with low-rank transitions.
Under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.
arXiv Detail & Related papers (2022-07-29T17:29:08Z)
- On the Robustness of Controlled Deep Reinforcement Learning for Slice Placement [0.8459686722437155]
We compare two deep reinforcement learning (DRL) algorithms: a pure DRL-based algorithm and a hybrid DRL-heuristic algorithm.
The evaluation results show that the proposed hybrid DRL-heuristic approach is more robust and reliable under unpredictable network load changes than pure DRL.
arXiv Detail & Related papers (2021-08-05T10:24:33Z)
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
- Large Batch Training Does Not Need Warmup [111.07680619360528]
Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications.
In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch training.
Based on our analysis, we bridge the gap and provide theoretical insights into three popular large-batch training techniques; a generic layer-wise rate-scaling rule is sketched after this list.
arXiv Detail & Related papers (2020-02-04T23:03:12Z)
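The CLARS rule itself is not given in the summary above, but layer-wise adaptive rate scaling in general (the LARS family that CLARS builds on) rescales each layer's update by a trust ratio of parameter norm to gradient norm. A minimal generic sketch with hypothetical names, using the standard LARS-style ratio rather than the CLARS algorithm itself:

```python
# Generic LARS-style layer-wise adaptive rate scaling (illustrative only;
# CLARS refines how the per-layer rates are set, which this sketch omits).
import torch

@torch.no_grad()
def layerwise_adaptive_step(params, base_lr=0.1, trust_coeff=1e-3, eps=1e-8):
    for w in params:
        if w.grad is None:
            continue
        # Each parameter tensor ("layer") gets its own effective learning
        # rate, proportional to ||w|| / ||grad w||, so layers whose weights
        # are small relative to their gradients take smaller steps.
        trust_ratio = trust_coeff * w.norm() / (w.grad.norm() + eps)
        w -= base_lr * trust_ratio * w.grad
```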