Related papers: A Deep Reinforcement Learning Framework for Optimizing Congestion Control in Data Centers

URL: http://arxiv.org/abs/2301.12558v1
Date: Sun, 29 Jan 2023 22:08:35 GMT
Title: A Deep Reinforcement Learning Framework for Optimizing Congestion Control in Data Centers
Authors: Shiva Ketabi, Hongkai Chen, Haiwei Dong, Yashar Ganjali
Abstract summary: Various congestion control protocols have been designed to achieve high performance in different network environments. Modern online learning solutions that delegate the congestion control actions to a machine cannot properly converge in the stringent time scales of data centers. We leverage multiagent reinforcement learning to design a system for dynamic tuning of congestion control parameters at end-hosts in a data center.
Score: 2.310582065745938
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Various congestion control protocols have been designed to achieve high performance in different network environments. Modern online learning solutions that delegate the congestion control actions to a machine cannot properly converge in the stringent time scales of data centers. We leverage multiagent reinforcement learning to design a system for dynamic tuning of congestion control parameters at end-hosts in a data center. The system includes agents at the end-hosts to monitor and report the network and traffic states, and agents to run the reinforcement learning algorithm given the states. Based on the state of the environment, the system generates congestion control parameters that optimize network performance metrics such as throughput and latency. As a case study, we examine BBR, an example of a prominent recently-developed congestion control protocol. Our experiments demonstrate that the proposed system has the potential to mitigate the problems of static parameters.

Related papers

Communication-Control Codesign for Large-Scale Wireless Networked Control Systems [80.30532872347668]
Wireless Networked Control Systems (WNCSs) are essential to Industry 4.0, enabling flexible control in applications, such as drone swarms and autonomous robots. We propose a practical WNCS model that captures correlated dynamics among multiple control loops with spatially distributed sensors and actuators sharing limited wireless resources over multi-state Markov block-fading channels. We develop a Deep Reinforcement Learning (DRL) algorithm that efficiently handles the hybrid action space, captures communication-control correlations, and ensures robust training despite sparse cross-domain variables and floating control inputs.
arXiv Detail & Related papers (2024-10-15T06:28:21Z)
A Decentralized and Self-Adaptive Approach for Monitoring Volatile Edge Environments [40.96858640950632]
We propose DEMon, a decentralized, self-adaptive monitoring system for edge environments. We implement the proposed system as a lightweight and portable container-based system and evaluate it through experiments. The results show that DEMon efficiently disseminates and retrieves the monitoring information, addressing the challenges of edge monitoring.
arXiv Detail & Related papers (2024-05-13T14:47:34Z)
ReACT: Reinforcement Learning for Controller Parametrization using B-Spline Geometries [0.0]
This work presents a novel approach using deep reinforcement learning (DRL) with N-dimensional B-spline geometries (BSGs). We focus on the control of parameter-variant systems, a class of systems with complex behavior that depends on the operating conditions. We make the adaptation process more efficient by introducing BSGs to map the controller parameters, which may depend on numerous operating conditions.
arXiv Detail & Related papers (2024-01-10T16:27:30Z)
Perimeter Control with Heterogeneous Metering Rates for Cordon Signals: A Physics-Regularized Multi-Agent Reinforcement Learning Approach [12.86346901414289]
Perimeter Control (PC) strategies have been proposed to address urban road network control in oversaturated situations. This paper leverages a Multi-Agent Reinforcement Learning (MARL)-based traffic signal control framework to decompose this PC problem. A physics regularization approach for the MARL framework is proposed to ensure the distributed cordon signal controllers are aware of the global network state.
arXiv Detail & Related papers (2023-08-24T13:51:16Z)
Deep Learning for Wireless Networked Systems: a joint Estimation-Control-Scheduling Approach [47.29474858956844]
A wireless networked control system (WNCS), which connects sensors, controllers, and actuators via wireless communications, is a key enabling technology for highly scalable and low-cost deployment of control systems in the Industry 4.0 era. Despite the tight interaction of control and communications in WNCSs, most existing works adopt separate design approaches. We propose a novel deep reinforcement learning (DRL)-based algorithm for controller and scheduler optimization that uses both model-free and model-based data.
arXiv Detail & Related papers (2022-10-03T01:29:40Z)
Deep Reinforcement Learning for Wireless Scheduling in Distributed Networked Control [37.10638636086814]
We consider a joint uplink and downlink scheduling problem of a fully distributed wireless networked control system (WNCS) with a limited number of frequency channels. We develop a deep reinforcement learning (DRL) based framework for solving it. To tackle the challenges of a large action space in DRL, we propose novel action space reduction and action embedding methods.
arXiv Detail & Related papers (2021-09-26T11:27:12Z)
Reinforcement Learning for Datacenter Congestion Control [50.225885814524304]
Successful congestion control algorithms can dramatically improve latency and overall network throughput. Until today, no such learning-based algorithms have shown practical potential in this domain. We devise an RL-based algorithm with the aim of generalizing to different configurations of real-world datacenter networks. We show that this scheme outperforms alternative popular RL approaches, and generalizes to scenarios that were not seen during training.
arXiv Detail & Related papers (2021-02-18T13:49:28Z)
Decentralized Control with Graph Neural Networks [147.84766857793247]
We propose a novel framework using graph neural networks (GNNs) to learn decentralized controllers. GNNs are well-suited for the task since they are naturally distributed architectures and exhibit good scalability and transferability properties. The problems of flocking and multi-agent path planning are explored to illustrate the potential of GNNs in learning decentralized controllers.
arXiv Detail & Related papers (2020-12-29T18:59:14Z)
Multi-UAV Path Planning for Wireless Data Harvesting with Deep Reinforcement Learning [18.266087952180733]
We propose a multi-agent reinforcement learning (MARL) approach that can adapt to profound changes in the scenario parameters defining the data harvesting mission. We show that our proposed network architecture enables the agents to cooperate effectively by carefully dividing the data collection task among themselves.
arXiv Detail & Related papers (2020-10-23T14:59:30Z)
Adaptive Subcarrier, Parameter, and Power Allocation for Partitioned Edge Learning Over Broadband Channels [69.18343801164741]
Partitioned edge learning (PARTEL) implements parameter-server training, a well-known distributed learning method, in a wireless network. We consider the case of deep neural network (DNN) models, which can be trained using PARTEL by introducing some auxiliary variables.
arXiv Detail & Related papers (2020-10-08T15:27:50Z)
Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search. We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
