Recurrent Off-Policy Deep Reinforcement Learning Doesn't Have to be Slow
- URL: http://arxiv.org/abs/2512.20513v1
- Date: Tue, 23 Dec 2025 17:02:17 GMT
- Title: Recurrent Off-Policy Deep Reinforcement Learning Doesn't Have to be Slow
- Authors: Tyler Clark, Christine Evers, Jonathon Hare
- Abstract summary: We introduce RISE (Recurrent Integration via Simplified Encodings), a novel approach that can leverage recurrent networks in any image-based off-policy RL setting. We observe a 35.6% human-normalized interquartile mean (IQM) performance improvement across the Atari benchmark.
- Score: 4.951247283741297
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recurrent off-policy deep reinforcement learning models achieve state-of-the-art performance but are often sidelined due to their high computational demands. In response, we introduce RISE (Recurrent Integration via Simplified Encodings), a novel approach that can leverage recurrent networks in any image-based off-policy RL setting without significant computational overhead by using both learnable and non-learnable encoder layers. When integrating RISE into leading non-recurrent off-policy RL algorithms, we observe a 35.6% human-normalized interquartile mean (IQM) performance improvement across the Atari benchmark. We analyze various implementation strategies to highlight the versatility and potential of our proposed framework.
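The abstract specifies only that RISE mixes learnable and non-learnable encoder layers feeding a recurrent network; the sketch below is one plausible reading of that idea, with frozen convolutional layers, a small learnable projection, and a GRU core. All layer shapes and names are illustrative assumptions, not the authors' architecture.

```python
# Hypothetical sketch of the idea described in the RISE abstract: a mix of
# non-learnable (frozen) and learnable encoder layers producing compact
# encodings that a recurrent core can consume cheaply. Sizes are guesses.
import torch
import torch.nn as nn

class SimplifiedRecurrentEncoder(nn.Module):
    def __init__(self, num_actions: int, enc_dim: int = 128, hidden: int = 256):
        super().__init__()
        # Non-learnable stage: randomly initialized conv layers kept frozen,
        # so they add no gradient or optimizer cost.
        self.frozen = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
        )
        for p in self.frozen.parameters():
            p.requires_grad_(False)
        # Learnable stage: a small projection to a simplified encoding.
        self.project = nn.Sequential(nn.Flatten(), nn.LazyLinear(enc_dim), nn.ReLU())
        # Recurrent core over the compact encodings.
        self.gru = nn.GRU(enc_dim, hidden, batch_first=True)
        self.q_head = nn.Linear(hidden, num_actions)

    def forward(self, frames, h0=None):
        # frames: (batch, time, 4, 84, 84) stacked grayscale observations
        b, t = frames.shape[:2]
        z = self.frozen(frames.flatten(0, 1))
        z = self.project(z).view(b, t, -1)
        out, hT = self.gru(z, h0)
        return self.q_head(out), hT  # per-step Q-values, final hidden state
```

Freezing the convolutional stage removes its gradient and optimizer cost, which is one way the stated goal of avoiding significant computational overhead could be met.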
Related papers
- Sample-Efficient Neurosymbolic Deep Reinforcement Learning [49.60927398960061]
We propose a neuro-symbolic Deep RL approach that integrates background symbolic knowledge to improve sample efficiency. Online reasoning is performed to guide the training process through two mechanisms. We show improved performance over a state-of-the-art reward machine baseline.
arXiv Detail & Related papers (2026-01-06T09:28:53Z)
- Periodic Asynchrony: An Effective Method for Accelerating Reinforcement Learning [8.395046547177806]
Reinforcement learning (RL) has attracted increasing attention, with growing efforts to reproduce and apply it. In mainstream RL frameworks, inference and training are typically deployed on the same devices. In this study, we return to the strategy of deploying inference and training separately. We transform the conventional synchronous architecture into a periodically asynchronous framework, which allows demand-driven, independent, and elastic scaling of each component.
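The summary describes separating inference from training and synchronizing them periodically rather than per step; the toy sketch below illustrates only that structure, with one actor thread and one learner thread. The sync interval and in-memory queue are stand-ins for whatever the paper's system actually uses.

```python
# Toy sketch of a periodically asynchronous actor/learner split. The real
# system, its scheduler, and its sync period are not specified above.
import queue
import threading

weights = {"step": 0}          # stands in for the policy parameters
replay = queue.Queue()         # unbounded toy replay buffer
lock = threading.Lock()
TOTAL_UPDATES = 5

def actor(sync_every: int = 50):
    local = dict(weights)      # act with a possibly stale snapshot
    produced = 0
    while weights["step"] < TOTAL_UPDATES:
        replay.put(("transition", local["step"]))
        produced += 1
        if produced % sync_every == 0:   # periodic, not per-step, sync
            with lock:
                local = dict(weights)

def learner():
    while weights["step"] < TOTAL_UPDATES:
        replay.get()           # consume experience and "train"
        with lock:
            weights["step"] += 1

threads = [threading.Thread(target=actor), threading.Thread(target=learner)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("finished after", weights["step"], "updates")
```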
arXiv Detail & Related papers (2025-11-24T08:22:50Z)
- Multi-Agent Reinforcement Learning for Sample-Efficient Deep Neural Network Mapping [54.65536245955678]
We present a decentralized multi-agent reinforcement learning (MARL) framework designed to overcome the challenge of sample inefficiency. We introduce an agent clustering algorithm that assigns similar mapping parameters to the same agents based on correlation analysis. Experimental results show our MARL approach improves sample efficiency by 30-300x over standard single-agent RL.
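The clustering criterion is not spelled out in the summary beyond "correlation analysis", so the following is only a generic illustration: estimate a correlation matrix over sampled mapping parameters and greedily merge strongly correlated ones into shared groups.

```python
# Generic sketch of grouping decision parameters by correlation, in the
# spirit of the clustering step the summary describes; the paper's actual
# criterion and data are not given here, so this is only illustrative.
import numpy as np

rng = np.random.default_rng(0)
# Rows: sampled mappings; columns: 6 hypothetical mapping parameters
# (e.g. tiling factors or loop orders encoded numerically).
samples = rng.normal(size=(200, 6))
samples[:, 1] = samples[:, 0] + 0.1 * rng.normal(size=200)   # correlated pair
samples[:, 4] = -samples[:, 3] + 0.1 * rng.normal(size=200)  # another pair

corr = np.corrcoef(samples, rowvar=False)

def cluster_by_correlation(corr: np.ndarray, threshold: float = 0.8):
    """Greedily merge parameters whose |correlation| exceeds the threshold."""
    n = corr.shape[0]
    cluster = list(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i, j]) >= threshold:
                old, new = cluster[j], cluster[i]
                cluster = [new if c == old else c for c in cluster]
    return cluster

print(cluster_by_correlation(corr))  # e.g. [0, 0, 2, 3, 3, 5]
```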
arXiv Detail & Related papers (2025-07-22T05:51:07Z)
- Meta-Reinforcement Learning for Fast and Data-Efficient Spectrum Allocation in Dynamic Wireless Networks [1.2940734305933084]
Dynamic allocation of spectrum in 5G/6G networks is critical to efficient resource utilization. Applying traditional deep reinforcement learning (DRL) is often infeasible due to its immense sample complexity. We propose a meta-learning framework that enables agents to learn a robust initial policy and rapidly adapt to new wireless scenarios.
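The summary names no particular meta-learning algorithm, only a robust initialization that adapts quickly. A first-order (Reptile-style) update is one standard way to get such an initialization; the toy regression below stands in for a wireless scenario and is purely illustrative.

```python
# Minimal first-order meta-learning sketch (Reptile-style) on a toy 1-D
# problem, to illustrate the inner-adapt / outer-update structure only.
# The paper's actual algorithm and task setup are not given above.
import numpy as np

rng = np.random.default_rng(1)
theta = np.zeros(2)                      # meta-initialization [slope, bias]

def task_loss_grad(theta, slope):
    """Gradient of MSE for fitting y = slope * x on sampled x."""
    x = rng.uniform(-1, 1, size=32)
    y = slope * x
    err = theta[0] * x + theta[1] - y
    return np.array([np.mean(2 * err * x), np.mean(2 * err)])

inner_lr, outer_lr, inner_steps = 0.1, 0.05, 5
for _ in range(500):                     # meta-training over random "scenarios"
    slope = rng.uniform(0.5, 2.0)        # each task = a different channel gain
    adapted = theta.copy()
    for _ in range(inner_steps):         # fast adaptation from the shared init
        adapted -= inner_lr * task_loss_grad(adapted, slope)
    theta += outer_lr * (adapted - theta)  # outer update toward adapted weights

print("meta-init:", theta)               # ends near the average task solution
```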
arXiv Detail & Related papers (2025-07-13T21:29:39Z)
- Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning [57.3885832382455]
We show that introducing static network sparsity alone can unlock further scaling potential beyond dense counterparts with state-of-the-art architectures. Our analysis reveals that, in contrast to naively scaling up dense DRL networks, such sparse networks achieve higher parameter efficiency for network expressivity.
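"Static network sparsity" suggests a mask that is fixed at initialization and never changed during training; a minimal sketch of that idea follows (the 90% ratio is an arbitrary choice, not the paper's).

```python
# Minimal sketch of static network sparsity: draw a random binary mask once
# and keep it for all of training, so only surviving weights are ever used.
import torch
import torch.nn as nn

class StaticSparseLinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, sparsity: float = 0.9):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        # Static mask: drawn once, stored as a buffer (not a learnable param).
        mask = (torch.rand(d_out, d_in) >= sparsity).float()
        self.register_buffer("mask", mask)

    def forward(self, x):
        # Gradients reach only the surviving weight entries.
        return nn.functional.linear(x, self.linear.weight * self.mask,
                                    self.linear.bias)

layer = StaticSparseLinear(512, 512, sparsity=0.9)
print(f"active weights: {int(layer.mask.sum())} / {layer.mask.numel()}")
```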
arXiv Detail & Related papers (2025-06-20T17:54:24Z)
- Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance [0.0]
Reinforcement learning (RL) suffers from severe sample inefficiency, especially during early training. We propose Dynamic Action Interpolation (DAI), a universal yet straightforward framework that interpolates expert and RL actions. Our theoretical analysis shows that DAI reshapes state visitation distributions to accelerate value function learning.
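From the description, the executed action is an interpolation between an expert's action and the RL policy's action. A minimal sketch with an assumed linear schedule follows (the paper's actual schedule is not given here).

```python
# Sketch of the interpolation idea the DAI summary describes. The linear
# decay from expert to policy is an assumption, not the paper's schedule.
import numpy as np

def interpolated_action(expert_a: np.ndarray, policy_a: np.ndarray,
                        step: int, total_steps: int) -> np.ndarray:
    """Blend expert and policy actions, trusting the policy more over time."""
    alpha = min(step / total_steps, 1.0)   # 0 -> pure expert, 1 -> pure policy
    return (1.0 - alpha) * expert_a + alpha * policy_a

# Early in training the executed action stays close to the expert:
print(interpolated_action(np.array([1.0]), np.array([-1.0]), step=100,
                          total_steps=10_000))  # ~[0.98]
```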
arXiv Detail & Related papers (2025-04-26T02:12:02Z)
- REBEL: Reinforcement Learning via Regressing Relative Rewards [59.68420022466047]
We propose REBEL, a minimalist RL algorithm for the era of generative models. In theory, we prove that fundamental RL algorithms like Natural Policy Gradient can be seen as variants of REBEL. We find that REBEL provides a unified approach to language modeling and image generation, with performance stronger than or similar to that of PPO and DPO.
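Reading "regressing relative rewards" literally, the core update can be sketched as a squared regression of the difference in policy log-ratios between two responses onto their reward difference; the scaling factor eta and the pairwise setup below are assumptions made for illustration.

```python
# Sketch of a relative-reward regression in the spirit of the REBEL abstract:
# fit the scaled difference of policy log-ratios for a pair of responses to
# their reward difference. Not an exact reproduction of the paper's loss.
import torch

def rebel_loss(logp_new_a, logp_old_a, logp_new_b, logp_old_b,
               reward_a, reward_b, eta=1.0):
    """Squared regression loss on a pair of responses (a, b) to one prompt."""
    ratio_a = logp_new_a - logp_old_a      # log pi_new/pi_old for response a
    ratio_b = logp_new_b - logp_old_b
    target = reward_a - reward_b           # relative reward of the pair
    return ((ratio_a - ratio_b) / eta - target) ** 2

# Toy pair: response a is better by 0.5 reward; the loss pushes the policy
# to raise a's probability relative to b's until the gap matches.
logp_new_a = torch.tensor(-1.0, requires_grad=True)
logp_new_b = torch.tensor(-0.9, requires_grad=True)
loss = rebel_loss(logp_new_a, torch.tensor(-1.2),
                  logp_new_b, torch.tensor(-0.9), reward_a=1.0, reward_b=0.5)
loss.backward()   # gradient flows through the new policy's log-probs
print(float(loss))  # 0.09 for this toy pair
```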
arXiv Detail & Related papers (2024-04-25T17:20:45Z)
- Hyperbolic Deep Reinforcement Learning [8.983647543608226]
We propose a new class of deep reinforcement learning algorithms that model latent representations in hyperbolic space.
We empirically validate our framework by applying it to popular on-policy and off-policy RL algorithms on the Procgen and Atari 100K benchmarks.
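The summary does not say which hyperbolic model is used; a common concrete choice (assumed here) is the Poincare ball, where Euclidean network outputs are projected via the exponential map at the origin.

```python
# Standard Poincare-ball utilities, shown as one concrete way to place
# latent representations in hyperbolic space; the paper's exact model and
# operations are not specified in the summary above.
import numpy as np

def expmap0(v: np.ndarray, c: float = 1.0) -> np.ndarray:
    """Map a Euclidean vector v into the Poincare ball of curvature -c."""
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    return np.tanh(np.sqrt(c) * norm) * v / (np.sqrt(c) * norm)

def poincare_distance(x, y, c: float = 1.0) -> float:
    """Geodesic distance between two points in the Poincare ball."""
    diff = np.linalg.norm(x - y) ** 2
    denom = (1 - c * np.linalg.norm(x) ** 2) * (1 - c * np.linalg.norm(y) ** 2)
    return float(np.arccosh(1 + 2 * c * diff / denom) / np.sqrt(c))

z = expmap0(np.array([3.0, 4.0]))  # network output projected into the ball
print(np.linalg.norm(z) < 1.0, poincare_distance(z, np.zeros(2)))  # True 10.0
```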
arXiv Detail & Related papers (2022-10-04T12:03:04Z)
- A Heuristically Assisted Deep Reinforcement Learning Approach for Network Slice Placement [0.7885276250519428]
We introduce a hybrid placement solution based on Deep Reinforcement Learning (DRL) and a dedicated optimization based on the Power of Two Choices principle.
The proposed Heuristically-Assisted DRL (HA-DRL) accelerates the learning process and improves resource usage compared with other state-of-the-art approaches.
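The Power of Two Choices principle itself is standard: sample two candidate hosts at random and place the load on the less-loaded one. How HA-DRL couples this with the DRL agent is not described in the summary; the sketch shows only the heuristic.

```python
# Classic Power of Two Choices placement heuristic. Its integration with
# the DRL policy in HA-DRL is not detailed in the summary above.
import random

def place(loads: list[int], rng: random.Random) -> int:
    """Pick two random servers and place on the lighter of the two."""
    i, j = rng.sample(range(len(loads)), 2)
    best = i if loads[i] <= loads[j] else j
    loads[best] += 1
    return best

rng = random.Random(0)
loads = [0] * 10
for _ in range(1000):
    place(loads, rng)
# Two choices keep the maximum load close to the average:
print(max(loads), min(loads))
```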
arXiv Detail & Related papers (2021-05-14T10:04:17Z)
- Phase Retrieval using Expectation Consistent Signal Recovery Algorithm based on Hypernetwork [73.94896986868146]
Phase retrieval is an important component in modern computational imaging systems.
Recent advances in deep learning have opened up new possibilities for robust and fast phase retrieval (PR).
We develop a novel framework for deep unfolding to overcome the existing limitations.
arXiv Detail & Related papers (2021-01-12T08:36:23Z)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
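Both ingredients can be sketched directly from the sentence above; the sigmoid weighting and the UCB bonus below are common formulations consistent with that description rather than an exact reproduction of the paper.

```python
# Sketch of SUNRISE's two ingredients as summarized above: (a) down-weight
# Bellman targets where the Q-ensemble disagrees, and (b) pick actions by an
# upper confidence bound over the ensemble. Forms/constants are assumptions.
import torch

def weighted_bellman_weights(q_targets: torch.Tensor, temperature: float = 10.0):
    """q_targets: (ensemble, batch). High disagreement -> weight near 0.5."""
    std = q_targets.std(dim=0)               # per-sample ensemble uncertainty
    return torch.sigmoid(-std * temperature) + 0.5   # in (0.5, 1.0]

def ucb_action(q_values: torch.Tensor, lam: float = 1.0) -> int:
    """q_values: (ensemble, actions). Explore where the ensemble is unsure."""
    mean, std = q_values.mean(dim=0), q_values.std(dim=0)
    return int(torch.argmax(mean + lam * std))

q_ens = torch.tensor([[1.0, 0.2], [1.1, 0.1], [0.9, 2.0]])  # 3 critics, 2 actions
print(ucb_action(q_ens))   # action 1 wins via its high uncertainty bonus
```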
arXiv Detail & Related papers (2020-07-09T17:08:44Z)