A Hierarchical Surrogate Model for Efficient Multi-Task Parameter Learning in Closed-Loop Control
- URL: http://arxiv.org/abs/2508.12738v2
- Date: Tue, 19 Aug 2025 06:46:05 GMT
- Title: A Hierarchical Surrogate Model for Efficient Multi-Task Parameter Learning in Closed-Loop Control
- Authors: Sebastian Hirt, Lukas Theiner, Maik Pfefferkorn, Rolf Findeisen
- Abstract summary: We propose a hierarchical Bayesian optimization (BO) framework tailored to efficient controller parameter learning. Instead of treating the closed-loop cost as a black-box, our method exploits structural knowledge of the underlying problem.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many control problems require repeated tuning and adaptation of controllers across distinct closed-loop tasks, where data efficiency and adaptability are critical. We propose a hierarchical Bayesian optimization (BO) framework that is tailored to efficient controller parameter learning in sequential decision-making and control scenarios for distinct tasks. Instead of treating the closed-loop cost as a black-box, our method exploits structural knowledge of the underlying problem, consisting of a dynamical system, a control law, and an associated closed-loop cost function. We construct a hierarchical surrogate model using Gaussian processes that capture the closed-loop state evolution under different parameterizations, while the task-specific weighting and accumulation into the closed-loop cost are computed exactly via known closed-form expressions. This allows knowledge transfer and enhanced data efficiency between different closed-loop tasks. The proposed framework retains sublinear regret guarantees on par with standard black-box BO, while enabling multi-task or transfer learning. Simulation experiments with model predictive control demonstrate substantial benefits in both sample efficiency and adaptability when compared to purely black-box BO approaches.
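The idea in the abstract, Gaussian processes for the closed-loop states and an exact closed-form accumulation into the cost, can be illustrated with a minimal sketch. This is not the authors' implementation: the scalar system, the feedback law `u = -theta * x`, the RBF kernel hyperparameters, and all function names below are assumptions chosen for illustration. The only closed-form step used is the Gaussian identity E[q x^2] = q (mu^2 + sigma^2), which lets different task weights reuse the same state surrogate.

```python
import numpy as np

def rbf(a, b, length=0.3):
    """Unit-variance RBF kernel between two 1-D arrays (assumed hyperparameters)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def rollout(theta, a=1.1, b=1.0, x0=1.0, horizon=5):
    """Closed-loop states of the toy system x+ = a*x + b*u with u = -theta*x."""
    xs, x = [], x0
    for _ in range(horizon):
        x = (a - b * theta) * x
        xs.append(x)
    return np.array(xs)

def fit_state_gps(thetas, noise=1e-6):
    """Condition one zero-mean GP per time step on observed closed-loop states."""
    X = np.array([rollout(t) for t in thetas])           # (n, horizon)
    K = rbf(thetas, thetas) + noise * np.eye(len(thetas))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, X))  # K^{-1} X
    return thetas, L, alpha

def predict_states(model, theta_query):
    """Posterior mean (m, horizon) and variance (m,) of the states."""
    thetas, L, alpha = model
    Ks = rbf(theta_query, thetas)                        # (m, n)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)                         # (n, m)
    # The variance is shared across time steps because all GPs use the
    # same kernel and the same training inputs.
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 0.0, None)
    return mean, var

def expected_cost(model, theta_query, weights):
    """Task cost sum_k q_k * E[x_k^2], using E[x^2] = mean^2 + variance."""
    mean, var = predict_states(model, theta_query)
    return (mean ** 2 + var[:, None]) @ weights

# One shared state surrogate, two tasks that differ only in their weights.
model = fit_state_gps(np.linspace(0.2, 1.8, 9))
grid = np.linspace(0.2, 1.8, 81)
cost_a = expected_cost(model, grid, np.ones(5))                 # uniform stage weights
cost_b = expected_cost(model, grid, np.array([0, 0, 0, 0, 1.0]))  # terminal weight only
best_a = grid[np.argmin(cost_a)]
```

In this sketch, switching tasks only changes the `weights` vector, so no new closed-loop experiments are needed to evaluate the second cost. A full BO loop would additionally need an acquisition function, which is omitted here.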
Related papers
- On Geometric Structures for Policy Parameterization in Continuous Control [7.056222499095849]
We propose a novel, computationally efficient action generation paradigm that preserves the structural benefits of operating on a unit manifold. Our method decomposes the action into a deterministic directional vector and a learnable concentration, enabling efficient interpolation between the target direction and uniform noise. Empirically, our method matches or exceeds state-of-the-art methods on standard continuous control benchmarks.
arXiv Detail & Related papers (2025-11-11T13:32:38Z)
- Cost-Sensitive Multi-Fidelity Bayesian Optimization with Transfer of Learning Curve Extrapolation [55.75188191403343]
We introduce utility, which is a function predefined by each user and describes the trade-off between cost and performance of BO.
We validate our algorithm on various LC datasets and find that it outperforms all the previous multi-fidelity BO and transfer-BO baselines we consider.
arXiv Detail & Related papers (2024-05-28T07:38:39Z)
- M-HOF-Opt: Multi-Objective Hierarchical Output Feedback Optimization via Multiplier Induced Loss Landscape Scheduling [4.369346338392536]
A probabilistic graphical model is proposed, modeling the joint model parameter and multiplier evolution. We address multi-objective model parameter optimization via a surrogate single objective penalty loss.
arXiv Detail & Related papers (2024-03-20T16:38:26Z)
- Ensemble Kalman Filtering Meets Gaussian Process SSM for Non-Mean-Field and Online Inference [47.460898983429374]
We introduce an ensemble Kalman filter (EnKF) into the non-mean-field (NMF) variational inference framework to approximate the posterior distribution of the latent states.
This novel marriage between EnKF and GPSSM not only eliminates the need for extensive parameterization in learning variational distributions, but also enables an interpretable, closed-form approximation of the evidence lower bound (ELBO).
We demonstrate that the resulting EnKF-aided online algorithm embodies a principled objective function by ensuring data-fitting accuracy while incorporating model regularizations to mitigate overfitting.
arXiv Detail & Related papers (2023-12-10T15:22:30Z)
- Tuning Legged Locomotion Controllers via Safe Bayesian Optimization [47.87675010450171]
This paper presents a data-driven strategy to streamline the deployment of model-based controllers in legged robotic hardware platforms.
We leverage a model-free safe learning algorithm to automate the tuning of control gains, addressing the mismatch between the simplified model used in the control formulation and the real system.
arXiv Detail & Related papers (2023-06-12T13:10:14Z)
- Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF).
It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z)
- Optimal Control of Nonlinear Systems with Unknown Dynamics [4.551160285910024]
This paper presents a data-driven method for finding a closed-loop optimal controller. It minimizes a specified infinite-horizon cost function for systems with unknown dynamics given any arbitrary initial state.
arXiv Detail & Related papers (2023-05-24T14:27:22Z)
- Evaluating model-based planning and planner amortization for continuous control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z)
- Finite-time System Identification and Adaptive Control in Autoregressive Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z)
- Data-Driven Optimized Tracking Control Heuristic for MIMO Structures: A Balance System Case Study [8.035375408614776]
The PID is illustrated on a two-input two-output balance system.
It integrates a self-adjusting nonlinear threshold with a neural network to compromise between the desired transient and steady state characteristics.
The neural network is trained upon optimizing a weighted-derivative like objective cost function.
arXiv Detail & Related papers (2021-04-01T02:00:20Z)
- Safe and Efficient Model-free Adaptive Control via Bayesian Optimization [39.962395119933596]
We propose a purely data-driven, model-free approach for adaptive control.
Tuning low-level controllers based solely on system data raises concerns about the safety and computational performance of the underlying algorithm.
We numerically demonstrate for several types of disturbances that our approach is sample efficient, outperforms constrained Bayesian optimization in terms of safety, and achieves the performance optima computed by grid evaluation.
arXiv Detail & Related papers (2021-01-19T19:15:00Z)
- Towards Interpretable-AI Policies Induction using Evolutionary Nonlinear Decision Trees for Discrete Action Systems [8.322816790979285]
We use a recently proposed nonlinear decision-tree (NLDT) approach to find a hierarchical set of control rules.
We find relatively simple and interpretable rules involving one to four non-linear terms per rule, while simultaneously achieving on par closed-loop performance.
arXiv Detail & Related papers (2020-09-20T20:41:57Z)
- Anticipating the Long-Term Effect of Online Learning in Control [75.6527644813815]
AntLer is a design algorithm for learning-based control laws that anticipates learning.
We show that AntLer approximates an optimal solution arbitrarily accurately with probability one.
arXiv Detail & Related papers (2020-07-24T07:00:14Z)
- Trajectory Optimization for Nonlinear Multi-Agent Systems using Decentralized Learning Model Predictive Control [5.2647625557619815]
We present a decentralized minimum-time trajectory optimization scheme based on learning model predictive control for multi-agent systems with nonlinear decoupled dynamics and coupled state constraints.
Our framework results in a decentralized controller, which requires no communication between agents over each iteration of task execution, and guarantees persistent feasibility, finite-time closed-loop convergence, and non-decreasing performance of the global system over task iterations.
arXiv Detail & Related papers (2020-04-02T23:04:10Z)