Analysis of Dual-Based PID Controllers through Convolutional Mirror
Descent
- URL: http://arxiv.org/abs/2202.06152v4
- Date: Tue, 19 Dec 2023 22:46:34 GMT
- Title: Analysis of Dual-Based PID Controllers through Convolutional Mirror
Descent
- Authors: Santiago R. Balseiro, Haihao Lu, Vahab Mirrokni, Balasubramanian Sivan
- Abstract summary: This paper provides the first regret bounds on the performance of dual-based PID controllers for online allocation problems.
We establish a fundamental connection between dual-based PID controllers and a new first-order algorithm for online convex optimization called \emph{Convolutional Mirror Descent} (CMD).
We provide the first regret bound for CMD for non-smooth convex optimization, which might be of independent interest.
- Score: 20.512667802427675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dual-based proportional-integral-derivative (PID) controllers are often
employed in practice to solve online allocation problems with global
constraints, such as budget pacing in online advertising. However, controllers
are used in a heuristic fashion and come with no provable guarantees on their
performance. This paper provides the first regret bounds on the performance of
dual-based PID controllers for online allocation problems. We do so by first
establishing a fundamental connection between dual-based PID controllers and a
new first-order algorithm for online convex optimization called
\emph{Convolutional Mirror Descent} (CMD), which updates iterates based on a
weighted moving average of past gradients. CMD recovers, in a special case,
online mirror descent with momentum and optimistic mirror descent. We establish
sufficient conditions under which CMD attains low regret for general online
convex optimization problems with adversarial inputs. We leverage this new
result to give the first regret bound for dual-based PID controllers for online
allocation problems. As a byproduct of our proofs, we provide the first regret
bound for CMD for non-smooth convex optimization, which might be of independent
interest.
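The CMD update described in the abstract, moving against a weighted moving average of past gradients rather than the latest gradient alone, can be sketched as follows. This is a minimal illustration under a Euclidean mirror map; the function and parameter names are illustrative and not taken from the paper:

```python
import numpy as np

def cmd(grad_fn, x0, eta, kernel, num_steps, proj=lambda x: x):
    """Sketch of Convolutional Mirror Descent (Euclidean mirror map).

    Each step moves against a convolution of all past gradients with
    a fixed weight kernel (most recent gradient gets kernel[0]).
    """
    x = np.asarray(x0, dtype=float)
    grads = []
    for _ in range(num_steps):
        grads.append(grad_fn(x))
        # weighted moving average of past gradients
        g = sum(kernel[s] * grads[-1 - s]
                for s in range(min(len(kernel), len(grads))))
        x = proj(x - eta * g)
    return x

# Minimizing f(x) = 0.5 * ||x - a||^2, whose gradient is x - a:
a = np.array([1.0, 2.0])
x_gd = cmd(lambda x: x - a, [0.0, 0.0], eta=0.1,
           kernel=[1.0], num_steps=500)            # plain gradient descent
x_mom = cmd(lambda x: x - a, [0.0, 0.0], eta=0.1,
            kernel=[0.6, 0.3, 0.1], num_steps=500)  # momentum-style kernel
```

With `kernel=[1.0]` this reduces to plain (projected) gradient descent, and a geometrically decaying kernel gives a momentum-style update, consistent with the abstract's remark that CMD recovers online mirror descent with momentum as a special case.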
Related papers
- Joint Transmit and Pinching Beamforming for Pinching Antenna Systems (PASS): Optimization-Based or Learning-Based? [89.05848771674773]
A novel pinching-antenna system (PASS)-enabled downlink multi-user multiple-input single-output (MISO) framework is proposed.
It consists of multiple waveguides, which are equipped with numerous low-cost pinching antennas (PAs).
The positions of the PAs can be reconfigured to span both large-scale paths and space.
arXiv Detail & Related papers (2025-02-12T18:54:10Z) - Two-Stage ML-Guided Decision Rules for Sequential Decision Making under Uncertainty [55.06411438416805]
Sequential Decision Making under Uncertainty (SDMU) is ubiquitous in many domains such as energy, finance, and supply chains.
Some SDMU are naturally modeled as Multistage Problems (MSPs) but the resulting optimizations are notoriously challenging from a computational standpoint.
This paper introduces a novel approach Two-Stage General Decision Rules (TS-GDR) to generalize the policy space beyond linear functions.
The effectiveness of TS-GDR is demonstrated through an instantiation using Deep Recurrent Neural Networks, named Two-Stage Deep Decision Rules (TS-LDR).
arXiv Detail & Related papers (2024-05-23T18:19:47Z) - Parameter-Adaptive Approximate MPC: Tuning Neural-Network Controllers without Retraining [50.00291020618743]
This work introduces a novel, parameter-adaptive AMPC architecture capable of online tuning without recomputing large datasets and retraining.
We showcase the effectiveness of parameter-adaptive AMPC by controlling the swing-ups of two different real cartpole systems with a severely resource-constrained microcontroller (MCU).
Taken together, these contributions represent a marked step toward the practical application of AMPC in real-world systems.
arXiv Detail & Related papers (2024-04-08T20:02:19Z) - Regret Analysis of Policy Optimization over Submanifolds for Linearly
Constrained Online LQG [12.201535821920624]
We study online linear quadratic Gaussian problems with a given linear constraint imposed on the controller.
We propose online optimistic Newton on manifold (OONM) which provides an online controller based on the prediction on the first and second order information of the function sequence.
arXiv Detail & Related papers (2024-03-13T14:06:18Z) - Beyond PID Controllers: PPO with Neuralized PID Policy for Proton Beam
Intensity Control in Mu2e [3.860979702631594]
We introduce a novel Proximal Policy Optimization (PPO) algorithm aimed at maintaining a uniform proton beam intensity delivery in the Muon to Electron Conversion Experiment (Mu2e) at Fermi National Accelerator Laboratory (Fermilab).
Our primary objective is to regulate the spill process to ensure a consistent intensity profile, with the ultimate goal of creating an automated controller capable of providing real-time feedback and calibration of the Spill Regulation System (SRS) parameters on a millisecond timescale.
arXiv Detail & Related papers (2023-12-28T21:35:20Z) - Meta-Learning Adversarial Bandit Algorithms [55.72892209124227]
We study online meta-learning with bandit feedback.
We learn to tune a generalization of online mirror descent (OMD) with self-concordant barrier regularizers.
arXiv Detail & Related papers (2023-07-05T13:52:10Z) - Meta-Learning Adversarial Bandits [49.094361442409785]
We study online learning with bandit feedback across multiple tasks, with the goal of improving average performance across tasks if they are similar according to some natural task-similarity measure.
As the first to target the adversarial setting, we design a meta-algorithm with setting-specific guarantees for two important cases: multi-armed bandits (MAB) and bandit linear optimization (BLO).
Our guarantees rely on proving that unregularized follow-the-leader combined with multiplicative weights is enough to online learn a non-smooth and non-convex sequence.
arXiv Detail & Related papers (2022-05-27T17:40:32Z) - On Centralized and Distributed Mirror Descent: Exponential Convergence
Analysis Using Quadratic Constraints [8.336315962271396]
Mirror descent (MD) is a powerful first-order optimization technique that subsumes several algorithms, including gradient descent (GD).
We study the exact convergence rate of MD in both centralized and distributed cases for strongly convex and smooth problems.
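The sense in which mirror descent subsumes other first-order methods can be shown with a small sketch: the squared-Euclidean mirror map yields plain gradient descent, while the negative-entropy mirror map on the simplex yields an exponentiated-gradient (multiplicative-weights) update. Function names below are illustrative, not from the paper:

```python
import numpy as np

def md_step_euclidean(x, g, eta):
    # Squared-Euclidean mirror map: mirror descent = gradient descent.
    return x - eta * g

def md_step_entropy(x, g, eta):
    # Negative-entropy mirror map on the probability simplex:
    # mirror descent = exponentiated gradient / multiplicative weights.
    y = x * np.exp(-eta * g)
    return y / y.sum()

# Repeatedly penalizing coordinate 0 with a linear loss drives its
# probability mass to zero while the iterate stays on the simplex.
p = np.full(4, 0.25)
g = np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(100):
    p = md_step_entropy(p, g, eta=0.5)
```

The design point is that only the mirror map changes between the two updates; the analysis template (and, in the paper above, the convergence-rate machinery) is shared.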
arXiv Detail & Related papers (2021-05-29T23:05:56Z) - Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction [94.25865526414717]
This paper considers a new problem of adapting a pre-trained model of human mesh reconstruction to out-of-domain streaming videos.
We propose Bilevel Online Adaptation (BOA), which divides the overall multi-objective optimization process into two steps of weight probe and weight update within a training iteration.
We demonstrate that BOA leads to state-of-the-art results on two human mesh reconstruction benchmarks.
arXiv Detail & Related papers (2021-03-30T15:47:58Z) - Online mirror descent and dual averaging: keeping pace in the dynamic
case [11.572321455920164]
Online mirror descent (OMD) and dual averaging (DA) are fundamental algorithms for online convex optimization.
We modify the OMD algorithm through a simple technique that we call stabilization.
We show that OMD with stabilization and DA enjoy the same performance guarantees in many applications -- even under dynamic learning rates.
arXiv Detail & Related papers (2020-06-03T23:41:40Z) - Improper Learning for Non-Stochastic Control [78.65807250350755]
We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states.
Applying online gradient descent to this parametrization yields a new controller that attains sublinear regret against a large class of closed-loop policies.
Our bounds are the first in the non-stochastic control setting that compete with \emph{all} stabilizing linear dynamical controllers.
arXiv Detail & Related papers (2020-01-25T02:12:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.