Plume: A Framework for High Performance Deep RL Network Controllers via Prioritized Trace Sampling
- URL: http://arxiv.org/abs/2302.12403v2
- Date: Sun, 12 Nov 2023 06:50:51 GMT
- Title: Plume: A Framework for High Performance Deep RL Network Controllers via Prioritized Trace Sampling
- Authors: Sagar Patel, Junyang Zhang, Sangeetha Abdu Jyothi, Nina Narodytska
- Abstract summary: We introduce a framework, Plume, to automatically identify and balance the skewed input trace distribution in DRL training datasets.
We evaluate Plume on three networking environments, including Adaptive Bitrate Streaming, Congestion Control, and Load Balancing.
Plume offers superior performance in both simulation and real-world settings, across different controllers and DRL algorithms.
- Score: 8.917042313344943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Reinforcement Learning (DRL) has shown promise in various networking
environments. However, these environments present several fundamental
challenges for standard DRL techniques. They are difficult to explore and
exhibit high levels of noise and uncertainty. Although these challenges
complicate the training process, we find that in practice we can substantially
mitigate their effects and even achieve state-of-the-art real-world performance
by addressing a factor that has been previously overlooked: the skewed input
trace distribution in DRL training datasets.
We introduce a generalized framework, Plume, to automatically identify and
balance the skew using a three-stage process. First, we identify the critical
features that determine the behavior of the traces. Second, we classify the
traces into clusters. Finally, we prioritize the salient clusters to improve
the overall performance of the controller. Plume seamlessly works across DRL
algorithms, without requiring any changes to the DRL workflow. We evaluated
Plume on three networking environments, including Adaptive Bitrate Streaming,
Congestion Control, and Load Balancing. Plume offers superior performance in
both simulation and real-world settings, across different controllers and DRL
algorithms. For example, our novel ABR controller, Gelato, trained with Plume,
has consistently outperformed prior state-of-the-art controllers on the live
streaming platform Puffer for over a year. It is the first controller on the
platform to deliver statistically significant improvements in both video
quality and stalling, decreasing stalls by as much as 75%.
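To make the three-stage process above concrete, the following is a minimal sketch (not the authors' code) of a prioritized trace sampler in this spirit: hand-picked trace features stand in for Plume's automatic critical-feature identification, traces are clustered, and under-represented clusters are up-weighted when drawing traces for training. The helper names extract_features and build_prioritized_sampler, the k-means clustering, and the inverse-frequency weighting are all illustrative assumptions, not Plume's actual design.

```python
import numpy as np
from sklearn.cluster import KMeans

def extract_features(trace):
    # Stand-in for Plume's critical-feature identification stage:
    # here we just use the mean and variability of a throughput trace.
    return np.array([np.mean(trace), np.std(trace)])

def build_prioritized_sampler(traces, n_clusters=8, seed=0):
    """Cluster traces by their features and up-weight rare (salient) clusters
    so they are drawn more often during DRL training."""
    feats = np.stack([extract_features(t) for t in traces])
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(feats)
    counts = np.bincount(labels, minlength=n_clusters)
    # Inverse-frequency weighting: traces from small clusters get more probability mass.
    weights = 1.0 / counts[labels]
    weights /= weights.sum()
    rng = np.random.default_rng(seed)

    def sample_trace():
        return traces[rng.choice(len(traces), p=weights)]

    return sample_trace
```

Under this sketch, each environment reset draws its trace from sample_trace() instead of sampling uniformly; the rest of the DRL loop is untouched, matching the abstract's claim that Plume requires no changes to the DRL workflow.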
Related papers
- Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining [74.83412846804977]
Reinforcement learning (RL)-based fine-tuning has become a crucial step in post-training language models.
We present a systematic end-to-end study of RL fine-tuning for mathematical reasoning by training models entirely from scratch.
arXiv Detail & Related papers (2025-04-10T17:15:53Z)
- Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network [23.481553466650453]
We propose Auto-Regressive Soft Q-learning (ARSQ), a value-based RL algorithm that models Q-values in a coarse-to-fine, auto-regressive manner.
ARSQ decomposes the continuous action space into discrete spaces in a coarse-to-fine hierarchy, enhancing sample efficiency for fine-grained continuous control tasks.
It auto-regressively predicts dimensional action advantages within each decision step, enabling more effective decision-making in continuous control tasks.
arXiv Detail & Related papers (2025-02-01T03:04:53Z)
- RLPP: A Residual Method for Zero-Shot Real-World Autonomous Racing on Scaled Platforms [9.517327026260181]
We propose RLPP, a residual RL framework that enhances a Pure Pursuit controller with an RL-based residual.
RLPP improves lap times of the baseline controllers by up to 6.37%, closing the gap to state-of-the-art methods by more than 52%.
RLPP is made available as an open-source tool, encouraging further exploration and advancement in autonomous racing research.
arXiv Detail & Related papers (2025-01-28T21:48:18Z)
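The residual scheme summarized in the RLPP entry above can be illustrated with a short sketch (ours, not the RLPP code): a fixed Pure Pursuit controller produces the base steering command and a learned policy only adds a small, bounded correction. The state layout, pure_pursuit_steer, residual_policy, and residual_scale are hypothetical placeholders.

```python
import numpy as np

def pure_pursuit_steer(state, wheelbase=0.33, lookahead=1.0):
    # Hypothetical geometric baseline: steering angle toward a lookahead point,
    # parameterized here only by the heading error alpha to that point.
    alpha = state["alpha"]
    return np.arctan2(2.0 * wheelbase * np.sin(alpha), lookahead)

def rlpp_style_action(state, residual_policy, residual_scale=0.1):
    """Residual RL: the learned policy corrects, rather than replaces, the baseline."""
    base = pure_pursuit_steer(state)
    correction = residual_scale * float(residual_policy(state))  # bounded learned term
    return base + correction
```

Keeping the correction small is what lets the learned part be deployed zero-shot on top of an already safe baseline controller.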
- D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861]
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments.
Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
arXiv Detail & Related papers (2024-08-15T22:27:00Z)
- How Does Forecasting Affect the Convergence of DRL Techniques in O-RAN Slicing? [20.344810727033327]
We propose a novel forecasting-aided DRL approach and a corresponding practical O-RAN deployment workflow to enhance DRL convergence.
Our approach shows up to 22.8%, 86.3%, and 300% improvements in the average initial reward value, convergence rate, and number of converged scenarios respectively.
arXiv Detail & Related papers (2023-09-01T14:30:04Z)
- DRL4Route: A Deep Reinforcement Learning Framework for Pick-up and Delivery Route Prediction [21.335721424944257]
We present the first attempt to generalize Reinforcement Learning (RL) to the route prediction task, leading to a novel RL-based framework called DRL4Route.
DRL4Route can serve as a plug-and-play component to boost existing deep learning models.
It follows the actor-critic architecture which is equipped with a Generalized Advantage Estimator.
arXiv Detail & Related papers (2023-07-30T14:50:31Z)
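For reference, the Generalized Advantage Estimator mentioned in the DRL4Route entry is the standard GAE computation sketched below; this is the usual textbook formulation, not code from DRL4Route, and it assumes a single non-terminating trajectory segment.

```python
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation: an exponentially weighted sum of
    one-step TD residuals. `values` holds one more entry than `rewards`
    (the value of the state after the last reward); terminal masking is omitted."""
    advantages = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```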
- Efficient Diffusion Policies for Offline Reinforcement Learning [85.73757789282212]
Diffusion-QL significantly boosts the performance of offline RL by representing a policy with a diffusion model.
We propose efficient diffusion policy (EDP) to overcome the limitations of this approach.
EDP constructs actions from corrupted ones at training to avoid running the sampling chain.
arXiv Detail & Related papers (2023-05-31T17:55:21Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interactions between the agent and the environment.
We propose a new method to solve this benchmark, using unsupervised model-based RL for pre-training the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Single-Shot Pruning for Offline Reinforcement Learning [47.886329599997474]
Deep Reinforcement Learning (RL) is a powerful framework for solving complex real-world problems.
One way to tackle this problem is to prune neural networks, leaving only the necessary parameters.
We close the gap between RL and single-shot pruning techniques and present a general pruning approach for offline RL.
arXiv Detail & Related papers (2021-12-31T18:10:02Z)
- Federated Deep Reinforcement Learning for the Distributed Control of NextG Wireless Networks [16.12495409295754]
Next Generation (NextG) networks are expected to support demanding tactile internet applications such as augmented reality and connected autonomous vehicles.
Data-driven approaches can improve the ability of the network to adapt to the current operating conditions.
Deep RL (DRL) has been shown to achieve good performance even in complex environments.
arXiv Detail & Related papers (2021-12-07T03:13:20Z)
- Optimizing Mixed Autonomy Traffic Flow With Decentralized Autonomous Vehicles and Multi-Agent RL [63.52264764099532]
We study the ability of autonomous vehicles to improve the throughput of a bottleneck using a fully decentralized control scheme in a mixed autonomy setting.
We apply multi-agent reinforcement learning algorithms to this problem and demonstrate that significant improvements in bottleneck throughput, from 20% at a 5% penetration rate to 33% at a 40% penetration rate, can be achieved.
arXiv Detail & Related papers (2020-10-30T22:06:05Z)
- Learning to Prune Deep Neural Networks via Reinforcement Learning [64.85939668308966]
PuRL is a deep reinforcement learning based algorithm for pruning neural networks.
It achieves sparsity and accuracy comparable to current state-of-the-art methods.
arXiv Detail & Related papers (2020-07-09T13:06:07Z)