Reduce Computational Cost In Deep Reinforcement Learning Via Randomized Policy Learning
- URL: http://arxiv.org/abs/2505.19054v1
- Date: Sun, 25 May 2025 09:17:22 GMT
- Title: Reduce Computational Cost In Deep Reinforcement Learning Via Randomized Policy Learning
- Authors: Zhuochen Liu, Rahul Jain, Quan Nguyen,
- Abstract summary: Recent advancements in reinforcement learning have leveraged neural networks to achieve state-of-the-art performance across various control tasks.<n>These successes often come at the cost of significant computational resources, as training deep neural networks requires substantial time and data.<n>We introduce an actor-critic algorithm that utilizes randomized neural networks to drastically reduce computational costs while maintaining strong performance.
- Score: 8.559995591255811
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in reinforcement learning (RL) have leveraged neural networks to achieve state-of-the-art performance across various control tasks. However, these successes often come at the cost of significant computational resources, as training deep neural networks requires substantial time and data. In this paper, we introduce an actor-critic algorithm that utilizes randomized neural networks to drastically reduce computational costs while maintaining strong performance. Despite its simple architecture, our method effectively solves a range of control problems, including the locomotion control of a highly dynamic 12-motor quadruped robot, and achieves results comparable to leading algorithms such as Proximal Policy Optimization (PPO). Notably, our approach does not outperform other algorithms in terms of sample efficnency but rather in terms of wall-clock training time. That is, although our algorithm requires more timesteps to converge to an optimal policy, the actual time required for training turns out to be lower.
Related papers
- Research on Edge Computing and Cloud Collaborative Resource Scheduling Optimization Based on Deep Reinforcement Learning [11.657154571216234]
This study addresses the challenge of resource scheduling optimization in edge-cloud collaborative computing using deep reinforcement learning (DRL)<n>The proposed DRL-based approach improves task processing efficiency, reduces overall processing time, enhances resource utilization, and effectively controls task migrations.
arXiv Detail & Related papers (2025-02-26T03:05:11Z) - Learning for Cross-Layer Resource Allocation in MEC-Aided Cell-Free Networks [71.30914500714262]
Cross-layer resource allocation over mobile edge computing (MEC)-aided cell-free networks can sufficiently exploit the transmitting and computing resources to promote the data rate.<n>Joint subcarrier allocation and beamforming optimization are investigated for the MEC-aided cell-free network from the perspective of deep learning.
arXiv Detail & Related papers (2024-12-21T10:18:55Z) - Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance.
Our method benefits from less cost during inference while keeping the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z) - HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler
for Neural Networks [51.71682428015139]
We propose HARL, a reinforcement learning-based auto-scheduler for efficient tensor program exploration.
HarL improves the tensor operator performance by 22% and the search speed by 4.3x compared to the state-of-the-art auto-scheduler.
Inference performance and search speed are also significantly improved on end-to-end neural networks.
arXiv Detail & Related papers (2022-11-21T04:15:27Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC)
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
arXiv Detail & Related papers (2022-04-14T17:46:26Z) - Dynamic Sparse Training for Deep Reinforcement Learning [36.66889208433228]
We propose for the first time to dynamically train deep reinforcement learning agents with sparse neural networks from scratch.
Our approach is easy to be integrated into existing deep reinforcement learning algorithms.
We evaluate our approach on OpenAI gym continuous control tasks.
arXiv Detail & Related papers (2021-06-08T09:57:20Z) - A Heuristically Assisted Deep Reinforcement Learning Approach for
Network Slice Placement [0.7885276250519428]
We introduce a hybrid placement solution based on Deep Reinforcement Learning (DRL) and a dedicated optimization based on the Power of Two Choices principle.
The proposed Heuristically-Assisted DRL (HA-DRL) allows to accelerate the learning process and gain in resource usage when compared against other state-of-the-art approaches.
arXiv Detail & Related papers (2021-05-14T10:04:17Z) - A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via
Adversarial Fine-tuning [90.44219200633286]
We propose a simple yet very effective adversarial fine-tuning approach based on a $textitslow start, fast decay$ learning rate scheduling strategy.
Experimental results show that the proposed adversarial fine-tuning approach outperforms the state-of-the-art methods on CIFAR-10, CIFAR-100 and ImageNet datasets.
arXiv Detail & Related papers (2020-12-25T20:50:15Z) - Deep Reinforcement Learning with Population-Coded Spiking Neural Network
for Continuous Control [0.0]
We propose a population-coded spiking actor network (PopSAN) trained in conjunction with a deep critic network using deep reinforcement learning (DRL)
We deployed the trained PopSAN on Intel's Loihi neuromorphic chip and benchmarked our method against the mainstream DRL algorithms for continuous control.
Our results support the efficiency of neuromorphic controllers and suggest our hybrid RL as an alternative to deep learning, when both energy-efficiency and robustness are important.
arXiv Detail & Related papers (2020-10-19T16:20:45Z) - PoPS: Policy Pruning and Shrinking for Deep Reinforcement Learning [16.269923100433232]
We develop a working algorithm, named Policy Pruning and Shrinking (PoPS), to train DRL models with strong performance.
PoPS is based on a novel iterative policy pruning and shrinking method that leverages the power of transfer learning.
We present an extensive experimental study that demonstrates the strong performance of PoPS using the popular Cartpole, Lunar Lander, Pong, and Pacman environments.
arXiv Detail & Related papers (2020-01-14T19:28:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.