Related papers: On Controller Tuning with Time-Varying Bayesian Optimization

On Controller Tuning with Time-Varying Bayesian Optimization

URL: http://arxiv.org/abs/2207.11120v1
Date: Fri, 22 Jul 2022 14:54:13 GMT
Title: On Controller Tuning with Time-Varying Bayesian Optimization
Authors: Paul Brunzema and Alexander von Rohr and Sebastian Trimpe
Abstract summary: We will use time-varying optimization (TVBO) to tune controllers online in changing environments using appropriate prior knowledge on the control objective and its changes. We propose a novel TVBO strategy using Uncertainty-Injection (UI), which incorporates the assumption of incremental and lasting changes. Our model outperforms the state-of-the-art method in TVBO, exhibiting reduced regret and fewer unstable parameter configurations.
Score: 74.57758188038375
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Changing conditions or environments can cause system dynamics to vary over time. To ensure optimal control performance, controllers should adapt to these changes. When the underlying cause and time of change is unknown, we need to rely on online data for this adaptation. In this paper, we will use time-varying Bayesian optimization (TVBO) to tune controllers online in changing environments using appropriate prior knowledge on the control objective and its changes. Two properties are characteristic of many online controller tuning problems: First, they exhibit incremental and lasting changes in the objective due to changes to the system dynamics, e.g., through wear and tear. Second, the optimization problem is convex in the tuning parameters. Current TVBO methods do not explicitly account for these properties, resulting in poor tuning performance and many unstable controllers through over-exploration of the parameter space. We propose a novel TVBO forgetting strategy using Uncertainty-Injection (UI), which incorporates the assumption of incremental and lasting changes. The control objective is modeled as a spatio-temporal Gaussian process (GP) with UI through a Wiener process in the temporal domain. Further, we explicitly model the convexity assumptions in the spatial dimension through GP models with linear inequality constraints. In numerical experiments, we show that our model outperforms the state-of-the-art method in TVBO, exhibiting reduced regret and fewer unstable parameter configurations.

Related papers

CANet: ChronoAdaptive Network for Enhanced Long-Term Time Series Forecasting under Non-Stationarity [0.0]
We introduce a novel architecture, ChoronoAdaptive Network (CANet), inspired by style-transfer techniques. The core of CANet is the Non-stationary Adaptive Normalization module, seamlessly integrating the Style Blending Gate and Adaptive Instance Normalization (AdaIN) experiments on real-world datasets validate CANet's superiority over state-of-the-art methods, achieving a 42% reduction in MSE and a 22% reduction in MAE.
arXiv Detail & Related papers (2025-04-24T20:05:33Z)
Large Continual Instruction Assistant [59.585544987096974]
Continual Instruction Tuning (CIT) is adopted to instruct Large Models to follow human intent data by data. Existing update gradient would heavily destroy the performance on previous datasets during CIT process. We propose a general continual instruction tuning framework to address the challenge.
arXiv Detail & Related papers (2024-10-08T11:24:59Z)
Parameter-Adaptive Approximate MPC: Tuning Neural-Network Controllers without Retraining [50.00291020618743]
This work introduces a novel, parameter-adaptive AMPC architecture capable of online tuning without recomputing large datasets and retraining. We showcase the effectiveness of parameter-adaptive AMPC by controlling the swing-ups of two different real cartpole systems with a severely resource-constrained microcontroller (MCU) Taken together, these contributions represent a marked step toward the practical application of AMPC in real-world systems.
arXiv Detail & Related papers (2024-04-08T20:02:19Z)
Sub-linear Regret in Adaptive Model Predictive Control [56.705978425244496]
We present STT-MPC (Self-Tuning Tube-based Model Predictive Control), an online oracle that combines the certainty-equivalence principle and polytopic tubes. We analyze the regret of the algorithm, when compared to an algorithm initially aware of the system dynamics.
arXiv Detail & Related papers (2023-10-07T15:07:10Z)
Self-Tuning PID Control via a Hybrid Actor-Critic-Based Neural Structure for Quadcopter Control [0.0]
Proportional-Integrator-Derivative (PID) controller is used in a wide range of industrial and experimental processes. Due to the uncertainty of model parameters and external disturbances, real systems such as Quadrotors need more robust and reliable PID controllers. In this research, a self-tuning PID controller using a Reinforcement-Learning-based Neural Network has been investigated.
arXiv Detail & Related papers (2023-07-03T19:35:52Z)
PTP: Boosting Stability and Performance of Prompt Tuning with Perturbation-Based Regularizer [94.23904400441957]
We introduce perturbation-based regularizers, which can smooth the loss landscape, into prompt tuning. We design two kinds of perturbation-based regularizers, including random-noise-based and adversarial-based. Our new algorithms improve the state-of-the-art prompt tuning methods by 1.94% and 2.34% on SuperGLUE and FewGLUE benchmarks, respectively.
arXiv Detail & Related papers (2023-05-03T20:30:51Z)
Performance-Driven Controller Tuning via Derivative-Free Reinforcement Learning [6.5158195776494]
We tackle the controller tuning problem using a novel derivative-free reinforcement learning framework. We conduct numerical experiments on two concrete examples from autonomous driving, namely, adaptive cruise control with PID controller and trajectory tracking with MPC controller. Experimental results show that the proposed method outperforms popular baselines and highlight its strong potential for controller tuning.
arXiv Detail & Related papers (2022-09-11T13:01:14Z)
Event-Triggered Time-Varying Bayesian Optimization [47.30677525394649]
We propose an event-triggered algorithm that treats the optimization problem as static until it detects changes in the objective function and then resets the dataset. This allows the algorithm to adapt online to realized temporal changes without the need for exact prior knowledge. We derive regret bounds of adaptive resets without exact prior knowledge on the temporal changes, and show in numerical experiments that ET-GP-UCB outperforms state-of-the-art algorithms on both synthetic and real-world data.
arXiv Detail & Related papers (2022-08-23T07:50:52Z)
Adaptive Model Predictive Control by Learning Classifiers [26.052368583196426]
We propose an adaptive MPC variant that automatically estimates control and model parameters. We leverage recent results showing that BO can be formulated as a density ratio estimation. This is then integrated into a model predictive path integral control framework yielding robust controllers for a variety of challenging robotics tasks.
arXiv Detail & Related papers (2022-03-13T23:22:12Z)
Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems [91.43582419264763]
We study the problem of system identification and adaptive control in partially observable linear dynamical systems. We present the first model estimation method with finite-time guarantees in both open and closed-loop system identification. We show that AdaptOn is the first algorithm that achieves $textpolylogleft(Tright)$ regret in adaptive control of unknown partially observable linear dynamical systems.
arXiv Detail & Related papers (2020-03-25T06:00:33Z)
Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty. LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
Online Parameter Estimation for Safety-Critical Systems with Gaussian Processes [6.122161391301866]
We present a Bayesian optimization framework based on Gaussian processes (GPs) for online parameter estimation. It uses an efficient search strategy over a response surface in the parameter space for finding the global optima with minimal function evaluations. We demonstrate our technique on an actuated planar pendulum and safety-critical quadrotor in simulation with changing parameters.
arXiv Detail & Related papers (2020-02-18T20:38:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.