Compressing Deep Reinforcement Learning Networks with a Dynamic
Structured Pruning Method for Autonomous Driving
- URL: http://arxiv.org/abs/2402.05146v1
- Date: Wed, 7 Feb 2024 09:00:30 GMT
- Title: Compressing Deep Reinforcement Learning Networks with a Dynamic
Structured Pruning Method for Autonomous Driving
- Authors: Wensheng Su, Zhenni Li, Minrui Xu, Jiawen Kang, Dusit Niyato, Shengli
Xie
- Abstract summary: Deep reinforcement learning (DRL) has shown remarkable success in complex autonomous driving scenarios.
However, DRL models inevitably incur high memory consumption and computational cost, which hinders their wide deployment in resource-limited autonomous driving devices.
We introduce a novel dynamic structured pruning approach that gradually removes a DRL model's unimportant neurons during the training stage.
- Score: 63.155562267383864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (DRL) has shown remarkable success in complex
autonomous driving scenarios. However, DRL models inevitably incur high memory consumption and computational cost, which hinders their wide deployment in resource-limited autonomous driving devices. Structured pruning has been
recognized as a useful method to compress and accelerate DRL models, but it is
still challenging to estimate the contribution of a parameter (i.e., neuron) to
DRL models. In this paper, we introduce a novel dynamic structured pruning
approach that gradually removes a DRL model's unimportant neurons during the
training stage. Our method consists of two steps, i.e., training DRL models with
a group sparse regularizer and removing unimportant neurons with a dynamic
pruning threshold. To efficiently train the DRL model with a small number of
important neurons, we employ a neuron-importance group sparse regularizer. In
contrast to conventional regularizers, this regularizer imposes a penalty on
redundant groups of neurons that do not significantly influence the output of
the DRL model. Furthermore, we design a novel structured pruning strategy to
dynamically determine the pruning threshold and gradually remove unimportant
neurons with a binary mask. Therefore, our method not only removes redundant groups of neurons from the DRL model but also achieves high and robust performance. Experimental results show that the proposed method is competitive
with existing DRL pruning methods on discrete control environments (i.e.,
CartPole-v1 and LunarLander-v2) and MuJoCo continuous environments (i.e.,
Hopper-v3 and Walker2D-v3). Specifically, our method effectively prunes $93\%$ of the neurons and $96\%$ of the weights of the DRL model in four challenging DRL environments with only slight accuracy degradation.
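The two steps described in the abstract can be illustrated with a minimal PyTorch-style sketch: a group-lasso penalty over per-neuron weight groups during training, followed by masking out neurons whose group norm falls below a dynamically chosen threshold. The network sizes, threshold schedule, and regularization weight below are illustrative assumptions rather than the authors' settings, and the RL objective is stubbed out.

```python
import torch
import torch.nn as nn

class PrunableMLPPolicy(nn.Module):
    """Toy DRL policy network whose hidden neurons can be masked out."""
    def __init__(self, obs_dim=8, hidden=256, act_dim=4):
        super().__init__()
        self.fc1 = nn.Linear(obs_dim, hidden)
        self.fc2 = nn.Linear(hidden, act_dim)
        # Binary mask over hidden neurons (1 = keep, 0 = pruned).
        self.register_buffer("mask", torch.ones(hidden))

    def forward(self, obs):
        h = torch.relu(self.fc1(obs)) * self.mask  # zero out pruned neurons
        return self.fc2(h)

def group_sparse_penalty(model):
    """Group-lasso penalty: one group = all weights tied to one hidden neuron
    (its incoming row of fc1 plus its outgoing column of fc2)."""
    groups = torch.cat([model.fc1.weight, model.fc2.weight.t()], dim=1)
    return groups.norm(dim=1).sum()            # sum of per-neuron L2 norms

def update_mask(model, sparsity_target):
    """Dynamic threshold: prune neurons whose group norm falls below the
    quantile implied by the current sparsity target."""
    with torch.no_grad():
        groups = torch.cat([model.fc1.weight, model.fc2.weight.t()], dim=1)
        importance = groups.norm(dim=1)
        threshold = torch.quantile(importance, sparsity_target)
        model.mask.copy_((importance >= threshold).float())

# Illustrative training loop; the RL objective is replaced by a dummy loss.
policy = PrunableMLPPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
lam = 1e-3                                     # regularization weight (assumed)
for step in range(1000):
    obs = torch.randn(32, 8)                   # placeholder observation batch
    rl_loss = policy(obs).pow(2).mean()        # stand-in for the actual DRL loss
    loss = rl_loss + lam * group_sparse_penalty(policy)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 100 == 0:                        # gradually raise target sparsity
        update_mask(policy, sparsity_target=min(0.9, step / 1000))
```

In this sketch one group collects a hidden neuron's incoming and outgoing weights, so zeroing its mask entry removes the whole neuron (structured pruning) rather than individual weights, and the mask is recomputed during training so the pruning threshold stays dynamic.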
Related papers
- The Impact of Quantization and Pruning on Deep Reinforcement Learning Models [1.5252729367921107]
Deep reinforcement learning (DRL) has achieved remarkable success across various domains, such as video games, robotics, and, recently, large language models.
However, the computational costs and memory requirements of DRL models often limit their deployment in resource-constrained environments.
Our study investigates the impact of two prominent compression methods, quantization and pruning, on DRL models.
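As a rough illustration of the two compression methods named in this entry, the snippet below applies global magnitude pruning and post-training dynamic quantization to a toy policy network; the architecture and the 50% sparsity level are assumptions for the example, not settings from the cited study.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small stand-in for a trained DRL policy network.
policy = nn.Sequential(nn.Linear(8, 128), nn.ReLU(), nn.Linear(128, 4))

# Magnitude pruning: zero out the 50% smallest weights across all linear layers.
prune.global_unstructured(
    [(m, "weight") for m in policy if isinstance(m, nn.Linear)],
    pruning_method=prune.L1Unstructured,
    amount=0.5,
)
for m in policy:
    if isinstance(m, nn.Linear):
        prune.remove(m, "weight")   # fold the pruning mask into the weights

# Post-training dynamic quantization: linear layers computed in int8.
quantized_policy = torch.quantization.quantize_dynamic(
    policy, {nn.Linear}, dtype=torch.qint8
)

obs = torch.randn(1, 8)
print(quantized_policy(obs))        # compressed policy still maps obs -> action logits
```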
arXiv Detail & Related papers (2024-07-05T18:21:17Z)
- A Real-World Quadrupedal Locomotion Benchmark for Offline Reinforcement Learning [27.00483962026472]
We benchmark 11 offline reinforcement learning algorithms on a realistic quadrupedal locomotion dataset.
Experiments show that the best-performing ORL algorithms can achieve competitive performance compared with model-free RL.
Our proposed benchmark will serve as a development platform for testing and evaluating the performance of ORL algorithms in real-world legged locomotion tasks.
arXiv Detail & Related papers (2023-09-13T13:18:29Z)
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model [119.65409513119963]
We introduce a new parameterization of the reward model in RLHF that enables extraction of the corresponding optimal policy in closed form.
The resulting algorithm, which we call Direct Preference Optimization (DPO), is stable, performant, and computationally lightweight.
Our experiments show that DPO can fine-tune LMs to align with human preferences as well as or better than existing methods.
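The closed-form reparameterization described above reduces to a simple logistic loss on preference pairs. Below is a minimal, assumed sketch of that loss (not code from the cited paper), where each input is the summed log-probability of a chosen or rejected response under the trainable policy or the frozen reference model.

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_policy_chosen, logp_policy_rejected,
             logp_ref_chosen, logp_ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for a batch of preference pairs."""
    # Implicit rewards are beta-scaled log-ratios against the reference model.
    chosen_rewards = beta * (logp_policy_chosen - logp_ref_chosen)
    rejected_rewards = beta * (logp_policy_rejected - logp_ref_rejected)
    # Maximize the margin between chosen and rejected via a logistic loss.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with random log-probabilities for a batch of 4 preference pairs.
lp_pc, lp_pr = torch.randn(4), torch.randn(4)
lp_rc, lp_rr = torch.randn(4), torch.randn(4)
print(dpo_loss(lp_pc, lp_pr, lp_rc, lp_rr))
```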
arXiv Detail & Related papers (2023-05-29T17:57:46Z)
- Turbulence control in plane Couette flow using low-dimensional neural ODE-based models and deep reinforcement learning [0.0]
"DManD-RL" (data-driven manifold dynamics-RL) generates a data-driven low-dimensional model of our system.
We train an RL control agent, yielding a 440-fold speedup over training on a numerical simulation.
The agent learns a policy that laminarizes 84% of unseen DNS test trajectories within 900 time units.
arXiv Detail & Related papers (2023-01-28T05:47:10Z)
- RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch [23.104546205134103]
Training deep reinforcement learning (DRL) models usually requires high costs.
Thus, compressing DRL models possesses immense potential for training acceleration and model deployment.
We propose a novel sparse DRL training framework, the "Rigged Reinforcement Learning Lottery" (RLx2).
arXiv Detail & Related papers (2022-05-30T12:18:43Z)
- Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models [67.78935378952146]
GenRL is a framework for solving sequential decision-making problems.
It exploits the combination of reinforcement learning and latent variable generative models.
We experimentally determine the characteristics of generative models that have the most influence on the performance of the final policy training.
arXiv Detail & Related papers (2022-04-18T22:02:32Z)
- Federated Deep Reinforcement Learning for the Distributed Control of NextG Wireless Networks [16.12495409295754]
Next Generation (NextG) networks are expected to support demanding tactile internet applications, such as augmented reality and connected autonomous vehicles.
Data-driven approaches can improve the ability of the network to adapt to the current operating conditions.
Deep RL (DRL) has been shown to achieve good performance even in complex environments.
arXiv Detail & Related papers (2021-12-07T03:13:20Z)
- Recurrent Model-Free RL is a Strong Baseline for Many POMDPs [73.39666827525782]
Many problems in RL, such as meta RL, robust RL, and generalization in RL, can be cast as POMDPs.
In theory, simply augmenting model-free RL with memory, such as recurrent neural networks, provides a general approach to solving all types of POMDPs.
Prior work has found that such recurrent model-free RL methods tend to perform worse than more specialized algorithms that are designed for specific types of POMDPs.
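The "model-free RL with memory" recipe described above amounts to swapping the feed-forward policy for a recurrent one and carrying hidden state across time steps; a minimal sketch, with architecture and sizes chosen purely for illustration:

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """Feed-forward actor turned into a memory-based one via a GRU core."""
    def __init__(self, obs_dim=16, hidden=64, act_dim=4):
        super().__init__()
        self.core = nn.GRU(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs_seq, h0=None):
        # obs_seq: [batch, time, obs_dim]; h0 carries memory across rollout chunks.
        feats, h_n = self.core(obs_seq, h0)
        return self.head(feats), h_n  # per-step action logits + final hidden state

policy = RecurrentPolicy()
obs = torch.randn(2, 10, 16)   # 2 rollouts, 10 steps of partial observations
logits, h = policy(obs)        # h is re-fed on the next chunk during training
```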
arXiv Detail & Related papers (2021-10-11T07:09:14Z)
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
- Learning to Prune Deep Neural Networks via Reinforcement Learning [64.85939668308966]
PuRL is a deep reinforcement learning-based algorithm for pruning neural networks.
It achieves sparsity and accuracy comparable to current state-of-the-art methods.
arXiv Detail & Related papers (2020-07-09T13:06:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.