PoPS: Policy Pruning and Shrinking for Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2001.05012v1
- Date: Tue, 14 Jan 2020 19:28:06 GMT
- Title: PoPS: Policy Pruning and Shrinking for Deep Reinforcement Learning
- Authors: Dor Livne and Kobi Cohen
- Abstract summary: We develop a working algorithm, named Policy Pruning and Shrinking (PoPS), to train DRL models with strong performance.
PoPS is based on a novel iterative policy pruning and shrinking method that leverages the power of transfer learning.
We present an extensive experimental study that demonstrates the strong performance of PoPS using the popular Cartpole, Lunar Lander, Pong, and Pacman environments.
- Score: 16.269923100433232
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent success of deep neural networks (DNNs) for function approximation
in reinforcement learning has triggered the development of Deep Reinforcement
Learning (DRL) algorithms in various fields, such as robotics, computer games,
natural language processing, computer vision, sensing systems, and wireless
networking. Unfortunately, DNNs suffer from high computational cost and memory
consumption, which limits the use of DRL algorithms in systems with limited
hardware resources. In recent years, pruning algorithms have demonstrated
considerable success in reducing the redundancy of DNNs in classification
tasks. However, existing algorithms suffer from a significant performance
reduction in the DRL domain. In this paper, we develop the first effective
solution to the performance reduction problem of pruning in the DRL domain, and
establish a working algorithm, named Policy Pruning and Shrinking (PoPS), to
train DRL models with strong performance while achieving a compact
representation of the DNN. The framework is based on a novel iterative policy
pruning and shrinking method that leverages the power of transfer learning when
training the DRL model. We present an extensive experimental study that
demonstrates the strong performance of PoPS using the popular Cartpole, Lunar
Lander, Pong, and Pacman environments. Finally, we develop open-source
software for the benefit of researchers and developers in related fields.
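The abstract names an iterative prune-then-shrink loop but does not spell out its steps. The block below is a generic sketch of that idea using plain magnitude pruning on a single bias-free hidden layer; it is not the PoPS algorithm itself. The function names `magnitude_prune` and `shrink_hidden_layer` are hypothetical, and the retraining step (where PoPS would apply transfer learning) is only marked with a comment.

```python
import numpy as np


def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude entries so roughly `sparsity` of them are zero."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value; ties at the threshold are pruned as well.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned


def shrink_hidden_layer(w_in, w_out):
    """Drop hidden units whose incoming weights are all zero.

    Assumes a bias-free layer, so such units contribute nothing downstream.
    w_in has shape (n_inputs, n_hidden); w_out has shape (n_hidden, n_outputs).
    """
    keep = np.any(w_in != 0.0, axis=0)
    return w_in[:, keep], w_out[keep, :]


# Illustrative loop with random weights (assumed sizes, not from the paper).
rng = np.random.default_rng(0)
w_in = rng.normal(size=(4, 8))
w_out = rng.normal(size=(8, 2))
for sparsity in (0.25, 0.5, 0.75):
    w_in = magnitude_prune(w_in, sparsity)
    # A retraining step would go here to recover performance after pruning.
w_in_small, w_out_small = shrink_hidden_layer(w_in, w_out)
```

Pruning only zeroes weights; the shrink step is what yields a smaller dense network, which is the "compact representation" the abstract refers to.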
Related papers
- Broad Critic Deep Actor Reinforcement Learning for Continuous Control [5.440090782797941]
A novel hybrid architecture for actor-critic reinforcement learning (RL) algorithms is introduced.
The proposed architecture integrates the broad learning system (BLS) with deep neural networks (DNNs).
The effectiveness of the proposed algorithm is evaluated by applying it to two classic continuous control tasks.
arXiv Detail & Related papers (2024-11-24T12:24:46Z)
- DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z) - Generative AI for Deep Reinforcement Learning: Framework, Analysis, and Use Cases [60.30995339585003]
Deep reinforcement learning (DRL) has been widely applied across various fields and has achieved remarkable accomplishments.
DRL faces certain limitations, including low sample efficiency and poor generalization.
We present how to leverage generative AI (GAI) to address these issues and enhance the performance of DRL algorithms.
arXiv Detail & Related papers (2024-05-31T01:25:40Z) - Snapshot Reinforcement Learning: Leveraging Prior Trajectories for
Efficiency [6.267119107674013]
Deep reinforcement learning (DRL) algorithms require substantial samples and computational resources to achieve higher performance.
We present the Snapshot Reinforcement Learning framework, which enhances sample efficiency by simply altering environments.
We propose a simple and effective SnapshotRL baseline algorithm, S3RL, which integrates well with existing DRL algorithms.
arXiv Detail & Related papers (2024-03-01T17:05:22Z) - A Review of Deep Reinforcement Learning in Serverless Computing:
Function Scheduling and Resource Auto-Scaling [2.0722667822370386]
This paper presents a comprehensive review of the application of Deep Reinforcement Learning (DRL) techniques in serverless computing.
A systematic review of recent studies applying DRL to serverless computing is presented, covering various algorithms, models, and performances.
Our analysis reveals that DRL, with its ability to learn and adapt from an environment, shows promising results in improving the efficiency of function scheduling and resource scaling.
arXiv Detail & Related papers (2023-10-05T09:26:04Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - A Low Latency Adaptive Coding Spiking Framework for Deep Reinforcement Learning [27.558298367330053]
In this paper, we use learnable matrix multiplication to encode and decode spikes, improving the flexibility of the coders.
We train the SNNs using the direct training method and use two different structures for online and offline RL algorithms.
Experiments have revealed that our method achieves optimal performance with ultra-low latency and excellent energy efficiency.
arXiv Detail & Related papers (2022-11-21T07:26:56Z) - Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z) - Reinforcement Learning for Datacenter Congestion Control [50.225885814524304]
Successful congestion control algorithms can dramatically improve latency and overall network throughput.
Until today, no such learning-based algorithms have shown practical potential in this domain.
We devise an RL-based algorithm with the aim of generalizing to different configurations of real-world datacenter networks.
We show that this scheme outperforms alternative popular RL approaches, and generalizes to scenarios that were not seen during training.
arXiv Detail & Related papers (2021-02-18T13:49:28Z) - Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z) - Deep Reinforcement Learning with Population-Coded Spiking Neural Network
for Continuous Control [0.0]
We propose a population-coded spiking actor network (PopSAN) trained in conjunction with a deep critic network using deep reinforcement learning (DRL).
We deployed the trained PopSAN on Intel's Loihi neuromorphic chip and benchmarked our method against the mainstream DRL algorithms for continuous control.
Our results support the efficiency of neuromorphic controllers and suggest our hybrid RL as an alternative to deep learning, when both energy-efficiency and robustness are important.
arXiv Detail & Related papers (2020-10-19T16:20:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.