Compressing Deep Reinforcement Learning Networks with a Dynamic
Structured Pruning Method for Autonomous Driving
- URL: http://arxiv.org/abs/2402.05146v1
- Date: Wed, 7 Feb 2024 09:00:30 GMT
- Title: Compressing Deep Reinforcement Learning Networks with a Dynamic
Structured Pruning Method for Autonomous Driving
- Authors: Wensheng Su, Zhenni Li, Minrui Xu, Jiawen Kang, Dusit Niyato, Shengli
Xie
- Abstract summary: Deep reinforcement learning (DRL) has shown remarkable success in complex autonomous driving scenarios.
However, DRL models inevitably incur high memory consumption and computational cost, which hinders their wide deployment in resource-limited autonomous driving devices.
We introduce a novel dynamic structured pruning approach that gradually removes a DRL model's unimportant neurons during the training stage.
- Score: 63.155562267383864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (DRL) has shown remarkable success in complex
autonomous driving scenarios. However, DRL models inevitably incur high memory consumption and computational cost, which hinders their wide deployment in resource-limited autonomous driving devices. Structured pruning has been
recognized as a useful method to compress and accelerate DRL models, but it is
still challenging to estimate the contribution of a parameter (i.e., neuron) to
DRL models. In this paper, we introduce a novel dynamic structured pruning
approach that gradually removes a DRL model's unimportant neurons during the
training stage. Our method consists of two steps, i.e., training DRL models with
a group sparse regularizer and removing unimportant neurons with a dynamic
pruning threshold. To efficiently train the DRL model with a small number of
important neurons, we employ a neuron-importance group sparse regularizer. In
contrast to conventional regularizers, this regularizer imposes a penalty on
redundant groups of neurons that do not significantly influence the output of
the DRL model. Furthermore, we design a novel structured pruning strategy to
dynamically determine the pruning threshold and gradually remove unimportant
neurons with a binary mask. Therefore, our method not only removes redundant groups of neurons from the DRL model but also achieves high and robust performance. Experimental results show that the proposed method is competitive
with existing DRL pruning methods on discrete control environments (i.e.,
CartPole-v1 and LunarLander-v2) and MuJoCo continuous environments (i.e.,
Hopper-v3 and Walker2D-v3). Specifically, our method effectively prunes $93\%$ of the neurons and $96\%$ of the weights of the DRL model in four challenging DRL environments with only slight accuracy degradation.
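The two steps described in the abstract can be illustrated with a minimal PyTorch-style sketch: a group-lasso penalty over per-neuron weight groups during training, followed by masking out neurons whose group norm falls below a dynamically chosen threshold. The network sizes, threshold schedule, and regularization weight below are illustrative assumptions rather than the authors' settings, and the RL objective is stubbed out.

```python
import torch
import torch.nn as nn

class PrunableMLPPolicy(nn.Module):
    """Toy DRL policy network whose hidden neurons can be masked out."""
    def __init__(self, obs_dim=8, hidden=256, act_dim=4):
        super().__init__()
        self.fc1 = nn.Linear(obs_dim, hidden)
        self.fc2 = nn.Linear(hidden, act_dim)
        # Binary mask over hidden neurons (1 = keep, 0 = pruned).
        self.register_buffer("mask", torch.ones(hidden))

    def forward(self, obs):
        h = torch.relu(self.fc1(obs)) * self.mask  # zero out pruned neurons
        return self.fc2(h)

def group_sparse_penalty(model):
    """Group-lasso penalty: one group = all weights tied to one hidden neuron
    (its incoming row of fc1 plus its outgoing column of fc2)."""
    groups = torch.cat([model.fc1.weight, model.fc2.weight.t()], dim=1)
    return groups.norm(dim=1).sum()            # sum of per-neuron L2 norms

def update_mask(model, sparsity_target):
    """Dynamic threshold: prune neurons whose group norm falls below the
    quantile implied by the current sparsity target."""
    with torch.no_grad():
        groups = torch.cat([model.fc1.weight, model.fc2.weight.t()], dim=1)
        importance = groups.norm(dim=1)
        threshold = torch.quantile(importance, sparsity_target)
        model.mask.copy_((importance >= threshold).float())

# Illustrative training loop; the RL objective is replaced by a dummy loss.
policy = PrunableMLPPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
lam = 1e-3                                     # regularization weight (assumed)
for step in range(1000):
    obs = torch.randn(32, 8)                   # placeholder observation batch
    rl_loss = policy(obs).pow(2).mean()        # stand-in for the actual DRL loss
    loss = rl_loss + lam * group_sparse_penalty(policy)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 100 == 0:                        # gradually raise target sparsity
        update_mask(policy, sparsity_target=min(0.9, step / 1000))
```

In this sketch one group collects a hidden neuron's incoming and outgoing weights, so zeroing its mask entry removes the whole neuron (structured pruning) rather than individual weights, and the mask is recomputed during training so the pruning threshold stays dynamic.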
Related papers
- The Impact of Quantization and Pruning on Deep Reinforcement Learning Models [1.5252729367921107]
Deep reinforcement learning (DRL) has achieved remarkable success across various domains, such as video games, robotics, and, recently, large language models.
However, the computational costs and memory requirements of DRL models often limit their deployment in resource-constrained environments.
Our study investigates the impact of two prominent compression methods, quantization and pruning, on DRL models.
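As a rough illustration of the two compression methods named in this entry, the snippet below applies global magnitude pruning and post-training dynamic quantization to a toy policy network; the architecture and the 50% sparsity level are assumptions for the example, not settings from the cited study.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small stand-in for a trained DRL policy network.
policy = nn.Sequential(nn.Linear(8, 128), nn.ReLU(), nn.Linear(128, 4))

# Magnitude pruning: zero out the 50% smallest weights across all linear layers.
prune.global_unstructured(
    [(m, "weight") for m in policy if isinstance(m, nn.Linear)],
    pruning_method=prune.L1Unstructured,
    amount=0.5,
)
for m in policy:
    if isinstance(m, nn.Linear):
        prune.remove(m, "weight")   # fold the pruning mask into the weights

# Post-training dynamic quantization: linear layers computed in int8.
quantized_policy = torch.quantization.quantize_dynamic(
    policy, {nn.Linear}, dtype=torch.qint8
)

obs = torch.randn(1, 8)
print(quantized_policy(obs))        # compressed policy still maps obs -> action logits
```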
arXiv Detail & Related papers (2024-07-05T18:21:17Z)
- A Real-World Quadrupedal Locomotion Benchmark for Offline Reinforcement Learning [27.00483962026472]
We benchmark 11 offline reinforcement learning algorithms on a realistic quadrupedal locomotion dataset.
Experiments show that the best-performing ORL algorithms can achieve competitive performance compared with model-free RL.
Our proposed benchmark will serve as a development platform for testing and evaluating the performance of ORL algorithms in real-world legged locomotion tasks.
arXiv Detail & Related papers (2023-09-13T13:18:29Z)
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model [119.65409513119963]
We introduce a new parameterization of the reward model in RLHF that enables extraction of the corresponding optimal policy in closed form.
The resulting algorithm, which we call Direct Preference Optimization (DPO), is stable, performant, and computationally lightweight.
Our experiments show that DPO can fine-tune LMs to align with human preferences as well as or better than existing methods.
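The closed-form reparameterization described above reduces to a simple logistic loss on preference pairs. Below is a minimal, assumed sketch of that loss (not code from the cited paper), where each input is the summed log-probability of a chosen or rejected response under the trainable policy or the frozen reference model.

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_policy_chosen, logp_policy_rejected,
             logp_ref_chosen, logp_ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for a batch of preference pairs."""
    # Implicit rewards are beta-scaled log-ratios against the reference model.
    chosen_rewards = beta * (logp_policy_chosen - logp_ref_chosen)
    rejected_rewards = beta * (logp_policy_rejected - logp_ref_rejected)
    # Maximize the margin between chosen and rejected via a logistic loss.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with random log-probabilities for a batch of 4 preference pairs.
lp_pc, lp_pr = torch.randn(4), torch.randn(4)
lp_rc, lp_rr = torch.randn(4), torch.randn(4)
print(dpo_loss(lp_pc, lp_pr, lp_rc, lp_rr))
```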
arXiv Detail & Related papers (2023-05-29T17:57:46Z)
- Turbulence control in plane Couette flow using low-dimensional neural ODE-based models and deep reinforcement learning [0.0]
"DManD-RL" (data-driven manifold dynamics-RL) generates a data-driven low-dimensional model of our system.
We train an RL control agent, yielding a 440-fold speedup over training on a numerical simulation.
The agent learns a policy that laminarizes 84% of unseen DNS test trajectories within 900 time units.
arXiv Detail & Related papers (2023-01-28T05:47:10Z)
- RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch [23.104546205134103]
Training deep reinforcement learning (DRL) models usually requires high costs.
Thus, compressing DRL models possesses immense potential for training acceleration and model deployment.
We propose a novel sparse DRL training framework, the "Rigged Reinforcement Learning Lottery" (RLx2).
arXiv Detail & Related papers (2022-05-30T12:18:43Z)
- Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models [67.78935378952146]
GenRL is a framework for solving sequential decision-making problems.
It exploits the combination of reinforcement learning and latent variable generative models.
We experimentally determine the characteristics of generative models that have the most influence on the performance of the final policy training.
arXiv Detail & Related papers (2022-04-18T22:02:32Z)
- Federated Deep Reinforcement Learning for the Distributed Control of NextG Wireless Networks [16.12495409295754]
Next Generation (NextG) networks are expected to support demanding tactile internet applications, such as augmented reality and connected autonomous vehicles.
Data-driven approaches can improve the ability of the network to adapt to the current operating conditions.
Deep RL (DRL) has been shown to achieve good performance even in complex environments.
arXiv Detail & Related papers (2021-12-07T03:13:20Z)
- Recurrent Model-Free RL is a Strong Baseline for Many POMDPs [73.39666827525782]
Many problems in RL, such as meta RL, robust RL, and generalization in RL, can be cast as POMDPs.
In theory, simply augmenting model-free RL with memory, such as recurrent neural networks, provides a general approach to solving all types of POMDPs.
Prior work has found that such recurrent model-free RL methods tend to perform worse than more specialized algorithms that are designed for specific types of POMDPs.
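The "model-free RL with memory" recipe described above amounts to swapping the feed-forward policy for a recurrent one and carrying hidden state across time steps; a minimal sketch, with architecture and sizes chosen purely for illustration:

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """Feed-forward actor turned into a memory-based one via a GRU core."""
    def __init__(self, obs_dim=16, hidden=64, act_dim=4):
        super().__init__()
        self.core = nn.GRU(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs_seq, h0=None):
        # obs_seq: [batch, time, obs_dim]; h0 carries memory across rollout chunks.
        feats, h_n = self.core(obs_seq, h0)
        return self.head(feats), h_n  # per-step action logits + final hidden state

policy = RecurrentPolicy()
obs = torch.randn(2, 10, 16)   # 2 rollouts, 10 steps of partial observations
logits, h = policy(obs)        # h is re-fed on the next chunk during training
```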
arXiv Detail & Related papers (2021-10-11T07:09:14Z)
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
- Learning to Prune Deep Neural Networks via Reinforcement Learning [64.85939668308966]
PuRL is a deep reinforcement learning-based algorithm for pruning neural networks.
It achieves sparsity and accuracy comparable to current state-of-the-art methods.
arXiv Detail & Related papers (2020-07-09T13:06:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.