Towards Model-Free LQR Control over Rate-Limited Channels
- URL: http://arxiv.org/abs/2401.01258v2
- Date: Fri, 2 Aug 2024 03:39:18 GMT
- Title: Towards Model-Free LQR Control over Rate-Limited Channels
- Authors: Aritra Mitra, Lintao Ye, Vijay Gupta
- Abstract summary: We study a setting where a worker agent transmits quantized policy gradients (of the LQR cost) to a server over a noiseless channel with a finite bit-rate.
We propose a new algorithm titled Adaptively Quantized Gradient Descent (AQGD), and prove that above a certain finite threshold bit-rate, AQGD guarantees exponentially fast convergence to the globally optimal policy.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given the success of model-free methods for control design in many problem settings, it is natural to ask how things will change if realistic communication channels are utilized for the transmission of gradients or policies. While the resulting problem has analogies with the formulations studied under the rubric of networked control systems, the rich literature in that area has typically assumed that the model of the system is known. As a step towards bridging the fields of model-free control design and networked control systems, we ask: \textit{Is it possible to solve basic control problems - such as the linear quadratic regulator (LQR) problem - in a model-free manner over a rate-limited channel?} Toward answering this question, we study a setting where a worker agent transmits quantized policy gradients (of the LQR cost) to a server over a noiseless channel with a finite bit-rate. We propose a new algorithm titled Adaptively Quantized Gradient Descent (\texttt{AQGD}), and prove that above a certain finite threshold bit-rate, \texttt{AQGD} guarantees exponentially fast convergence to the globally optimal policy, with \textit{no deterioration of the exponent relative to the unquantized setting}. More generally, our approach reveals the benefits of adaptive quantization in preserving fast linear convergence rates, and, as such, may be of independent interest to the literature on compressed optimization.
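The adaptive-quantization idea behind such schemes can be sketched as follows. This is a minimal illustration under our own assumptions, not the paper's exact AQGD: the worker quantizes its gradient onto a uniform grid whose radius contracts geometrically as the iterates converge, so a fixed per-round bit budget delivers ever-finer resolution and the linear convergence rate of unquantized gradient descent can be preserved. The function names, step sizes, and the toy quadratic objective are ours.

```python
import numpy as np

def quantize(x, radius, bits):
    """Uniformly quantize each coordinate of x from [-radius, radius]
    onto a grid with 2**bits levels; returns the dequantized value."""
    levels = 2 ** bits
    step = 2 * radius / (levels - 1)
    clipped = np.clip(x, -radius, radius)
    return np.round((clipped + radius) / step) * step - radius

def adaptively_quantized_gd(grad, x0, lr=0.1, bits=8, radius0=10.0,
                            shrink=0.95, iters=200):
    """Toy sketch: the server runs gradient descent on gradients that the
    worker sends through a finite-bit quantizer whose range both sides
    contract by the same factor each round."""
    x, radius = np.array(x0, dtype=float), radius0
    for _ in range(iters):
        g = quantize(grad(x), radius, bits)  # worker -> server message
        x = x - lr * g                       # server update
        radius *= shrink                     # shared grid contraction
    return x

# Example: minimize a strongly convex quadratic f(x) = 0.5 x'Ax - b'x,
# whose unique minimizer is x* = A^{-1} b.
A = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([1.0, 1.0])
x_star = np.linalg.solve(A, b)
x_hat = adaptively_quantized_gd(lambda x: A @ x - b, x0=[5.0, -5.0])
```

The key design point mirrored here is that the quantization range must shrink no faster than the iterates converge; otherwise the true gradient eventually falls outside the grid and progress stalls.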
Related papers
- Learning Resilient Radio Resource Management Policies with Graph Neural Networks [124.89036526192268]
We formulate a resilient radio resource management problem with per-user minimum-capacity constraints.
We show that we can parameterize the user selection and power control policies using a finite set of parameters.
Thanks to such adaptation, our proposed method achieves a superior tradeoff between the average rate and the 5th percentile rate.
arXiv Detail & Related papers (2022-03-07T19:40:39Z) - Linear Stochastic Bandits over a Bit-Constrained Channel [37.01818450308119]
We introduce a new linear bandit formulation over a bit-constrained channel.
The goal of the server is to take actions based on estimates of an unknown model parameter to minimize cumulative regret.
We prove that when the unknown model is $d$-dimensional, a channel capacity of $O(d)$ bits suffices to achieve order-optimal regret.
arXiv Detail & Related papers (2022-03-02T15:54:03Z) - Deep Reinforcement Learning for Wireless Scheduling in Distributed Networked Control [37.10638636086814]
We consider a joint uplink and downlink scheduling problem of a fully distributed wireless control system (WNCS) with a limited number of frequency channels.
We develop a deep reinforcement learning (DRL) based framework for solving it.
To tackle the challenges of a large action space in DRL, we propose novel action space reduction and action embedding methods.
arXiv Detail & Related papers (2021-09-26T11:27:12Z) - Finite-time System Identification and Adaptive Control in Autoregressive Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z) - Control of Stochastic Quantum Dynamics with Differentiable Programming [0.0]
We propose a framework for the automated design of control schemes based on differentiable programming.
We apply this approach to state preparation and stabilization of a qubit subjected to homodyne detection.
Despite the resulting poor signal-to-noise ratio, we can train our controller to prepare and stabilize the qubit to a target state with a mean fidelity around 85%.
arXiv Detail & Related papers (2021-01-04T19:00:03Z) - Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z) - Chance-Constrained Control with Lexicographic Deep Reinforcement Learning [77.34726150561087]
This paper proposes a lexicographic Deep Reinforcement Learning (DeepRL)-based approach to chance-constrained Markov Decision Processes.
A lexicographic version of the well-known DeepRL algorithm DQN is also proposed and validated via simulations.
arXiv Detail & Related papers (2020-10-19T13:09:14Z) - Adaptive Subcarrier, Parameter, and Power Allocation for Partitioned Edge Learning Over Broadband Channels [69.18343801164741]
Partitioned edge learning (PARTEL) implements parameter-server training, a well-known distributed learning method, in a wireless network.
We consider the case of deep neural network (DNN) models which can be trained using PARTEL by introducing some auxiliary variables.
arXiv Detail & Related papers (2020-10-08T15:27:50Z) - Closed-loop Parameter Identification of Linear Dynamical Systems through the Lens of Feedback Channel Coding Theory [0.0]
This paper considers the problem of closed-loop identification of linear scalar systems with Gaussian process noise.
We show that the learning rate is fundamentally upper bounded by the capacity of the corresponding AWGN channel.
Although the optimal design of the feedback policy remains challenging, we derive conditions under which the upper bound is achieved.
arXiv Detail & Related papers (2020-03-27T17:30:10Z) - Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.