Time-Series-Informed Closed-loop Learning for Sequential Decision Making and Control
- URL: http://arxiv.org/abs/2412.02423v1
- Date: Tue, 03 Dec 2024 12:38:53 GMT
- Title: Time-Series-Informed Closed-loop Learning for Sequential Decision Making and Control
- Authors: Sebastian Hirt, Lukas Theiner, Rolf Findeisen
- Abstract summary: Traditional Bayesian optimization approaches treat the learning problem as a black box, ignoring valuable information and knowledge about the structure of the underlying problem.
We propose a time-series-informed optimization framework that incorporates intermediate performance evaluations from early iterations of each experimental episode into the learning procedure.
We show that our approach achieves baseline performance with approximately half the resources and outperforms the baseline in terms of final closed-loop performance.
- Abstract: Closed-loop performance of sequential decision making algorithms, such as model predictive control, depends strongly on the parameters of cost functions, models, and constraints. Bayesian optimization is a common approach to learning these parameters based on closed-loop experiments. However, traditional Bayesian optimization approaches treat the learning problem as a black box, ignoring valuable information and knowledge about the structure of the underlying problem, resulting in slow convergence and high experimental resource use. We propose a time-series-informed optimization framework that incorporates intermediate performance evaluations from early iterations of each experimental episode into the learning procedure. Additionally, probabilistic early stopping criteria are proposed to terminate unpromising experiments, significantly reducing experimental time. Simulation results show that our approach achieves baseline performance with approximately half the resources. Moreover, with the same resource budget, our approach outperforms the baseline in terms of final closed-loop performance, highlighting its efficiency in sequential decision making scenarios.
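To make the two ingredients of the abstract concrete, the following Python sketch couples a standard Gaussian-process Bayesian optimization loop with (i) per-time-step cost accumulation inside each episode and (ii) a probabilistic early-stopping test that abandons an episode once its extrapolated final cost is very likely worse than the best cost seen so far. The toy dynamics, kernel choice, cost extrapolation, and thresholds are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: Bayesian optimization of a controller parameter with
# intermediate (per-time-step) cost feedback and probabilistic early stopping.
# The episode model, kernel, and thresholds are illustrative assumptions.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

def run_episode(theta, best_cost, horizon=50, stop_conf=0.95):
    """Simulate one closed-loop episode, accumulating stage costs.

    The episode stops early once the extrapolated final cost is very likely
    worse than the best full-episode cost observed so far."""
    cost = 0.0
    for t in range(1, horizon + 1):
        # Toy stage cost: quadratic in the parameter plus noise (placeholder
        # for the true closed-loop system under a controller with parameter theta).
        stage = (theta - 0.3) ** 2 + 0.01 * rng.normal()
        cost += max(stage, 0.0)
        # Crude extrapolation of the final cost from the partial time series.
        mean_final = cost / t * horizon
        std_final = 0.05 * horizon / np.sqrt(t)
        if best_cost is not None:
            p_worse = 1.0 - norm.cdf(best_cost, loc=mean_final, scale=std_final)
            if p_worse > stop_conf:
                return mean_final, t          # early stop: unpromising episode
    return cost, horizon

def expected_improvement(mu, sigma, best):
    z = (best - mu) / np.maximum(sigma, 1e-9)
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

candidates = np.linspace(-1.0, 1.0, 201).reshape(-1, 1)
X, y, best = [], [], None
for it in range(15):
    if len(X) < 3:                            # initial random design
        theta = float(rng.uniform(-1.0, 1.0))
    else:
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
        gp.fit(np.array(X), np.array(y))
        mu, sigma = gp.predict(candidates, return_std=True)
        theta = float(candidates[np.argmax(expected_improvement(mu, sigma, best))])
    cost, steps = run_episode(theta, best)
    X.append([theta]); y.append(cost)
    best = cost if best is None or cost < best else best
    print(f"iter {it:2d}  theta={theta:+.3f}  cost={cost:7.3f}  steps={steps}")
```

Early-stopped episodes still contribute an estimated cost to the surrogate model, so partial experiments are not wasted; this mirrors, at sketch level, the resource savings claimed in the abstract.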
Related papers
- Bayesian Optimization for Non-Convex Two-Stage Stochastic Optimization Problems [2.9016548477524156]
We formulate a knowledge-gradient-based acquisition function to jointly optimize the first-stage variables, establish a guarantee of consistency, and provide an approximation.
We demonstrate empirical results comparable to an alternative formulation that alternates between the two variable types.
arXiv Detail & Related papers (2024-08-30T16:26:31Z)
- Learning From Scenarios for Stochastic Repairable Scheduling [3.9948520633731026]
We show how decision-focused learning techniques based on smoothing can be adapted to a scheduling problem.
We include an experimental evaluation to investigate in which situations decision-focused learning outperforms the state of the art for such problems, scenario-based optimization.
arXiv Detail & Related papers (2023-12-06T13:32:17Z)
- Predict-Then-Optimize by Proxy: Learning Joint Models of Prediction and Optimization [59.386153202037086]
The Predict-Then-Optimize framework uses machine learning models to predict unknown parameters of an optimization problem from features before solving.
This approach can be inefficient and requires handcrafted, problem-specific rules for backpropagation through the optimization step.
This paper proposes an alternative method, in which optimal solutions are learned directly from the observable features by predictive models.
arXiv Detail & Related papers (2023-11-22T01:32:06Z)
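As a rough illustration of learning solutions directly from observable features, the sketch below trains a small model whose output is itself a feasible decision (a point on the probability simplex) and whose training loss is the optimization objective, so no solver call or backpropagation through an argmin is needed. The linear model, toy data, and simplex feasible set are assumptions for illustration, not the paper's architecture.

```python
# Hedged sketch of the "learn solutions directly from features" idea: a model
# maps features to a feasible decision (a point on the probability simplex)
# and is trained on the optimization objective itself. Model and data are toy.
import numpy as np

rng = np.random.default_rng(1)
n_features, n_items, n_samples = 5, 4, 200

A_true = rng.normal(size=(n_items, n_features))      # hidden cost model
X = rng.normal(size=(n_samples, n_features))          # observable features
C = X @ A_true.T                                      # true item costs per sample

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

W = np.zeros((n_items, n_features))                   # proxy model parameters
lr = 0.5
for epoch in range(300):
    P = softmax(X @ W.T)                              # predicted decisions (simplex)
    loss = np.mean(np.sum(C * P, axis=1))             # expected cost of decisions
    # Gradient of the objective through the softmax decision map.
    G = P * (C - np.sum(C * P, axis=1, keepdims=True))
    W -= lr * (G.T @ X) / n_samples
    if epoch % 100 == 0:
        print(f"epoch {epoch:3d}  mean decision cost {loss:.4f}")

# Compare against always picking the per-sample cheapest item (oracle).
oracle = np.mean(C.min(axis=1))
learned = np.mean(np.sum(C * softmax(X @ W.T), axis=1))
print(f"oracle cost {oracle:.4f}  learned-proxy cost {learned:.4f}")
```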
- Primal-Dual Contextual Bayesian Optimization for Control System Online Optimization with Time-Average Constraints [21.38692458445459]
This paper studies the problem of online performance optimization of constrained closed-loop control systems.
A primal-dual contextual Bayesian optimization algorithm is proposed that achieves sublinear cumulative regret with respect to the dynamic optimal solution.
arXiv Detail & Related papers (2023-04-12T18:37:52Z)
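A minimal sketch of a primal-dual loop of this flavor, assuming a scalar decision variable, Gaussian-process surrogates for reward and constraint, and a Lagrangian-style acquisition with dual ascent on the observed violation; the test functions, kernels, and step sizes are illustrative, not the algorithm from the paper.

```python
# Hedged sketch of a primal-dual Bayesian optimization loop with a
# time-average constraint: one GP models the reward, one the constraint,
# and a dual variable is updated from the observed constraint violation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(2)

def reward(x):      # unknown objective to maximize
    return float(np.sin(3.0 * x) + 0.5 * x + 0.05 * rng.normal())

def constraint(x):  # unknown constraint, want time-average <= 0
    return float(x ** 2 - 0.5 + 0.05 * rng.normal())

grid = np.linspace(-1.0, 1.0, 201).reshape(-1, 1)
X, yf, yg = [], [], []
lam, eta, beta = 0.0, 0.5, 2.0           # dual variable, dual step, UCB width

for t in range(25):
    if len(X) < 3:                       # initial random design
        x = float(rng.uniform(-1.0, 1.0))
    else:
        gp_f = GaussianProcessRegressor(kernel=RBF(0.3), alpha=1e-3).fit(X, yf)
        gp_g = GaussianProcessRegressor(kernel=RBF(0.3), alpha=1e-3).fit(X, yg)
        mf, sf = gp_f.predict(grid, return_std=True)
        mg, sg = gp_g.predict(grid, return_std=True)
        # Optimistic Lagrangian: optimistic reward, optimistic constraint.
        acq = (mf + beta * sf) - lam * (mg - beta * sg)
        x = float(grid[np.argmax(acq)])
    f, g = reward(x), constraint(x)
    X.append([x]); yf.append(f); yg.append(g)
    lam = max(0.0, lam + eta * g)        # dual ascent on the observed violation
    print(f"t={t:2d}  x={x:+.3f}  reward={f:+.3f}  constraint={g:+.3f}  lambda={lam:.3f}")

print(f"average constraint violation: {np.mean(np.maximum(yg, 0.0)):.3f}")
```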
- Efficient Real-world Testing of Causal Decision Making via Bayesian Experimental Design for Contextual Optimisation [12.37745209793872]
We introduce a model-agnostic framework for gathering data to evaluate and improve contextual decision making.
Our method is used for the data-efficient evaluation of the regret of past treatment assignments.
arXiv Detail & Related papers (2022-07-12T01:20:11Z)
- Fast Rates for Contextual Linear Optimization [52.39202699484225]
We show that a naive plug-in approach achieves regret convergence rates that are significantly faster than methods that directly optimize downstream decision performance.
Our results are overall positive for practice: predictive models are easy and fast to train using existing tools, simple to interpret, and, as we show, lead to decisions that perform very well.
arXiv Detail & Related papers (2020-11-05T18:43:59Z)
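A minimal sketch of the plug-in idea: regress the unknown cost vector on the context, then plug the prediction into the downstream linear problem (here, simply choosing the cheapest of a few discrete actions). The data-generating process and the use of ordinary least squares are illustrative assumptions.

```python
# Hedged sketch of the plug-in approach to contextual linear optimization:
# predict the cost vector from the context, then optimize with the prediction.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n_features, n_actions, n_train, n_test = 4, 5, 500, 200

B = rng.normal(size=(n_actions, n_features))                  # hidden cost model
def sample(n):
    Z = rng.normal(size=(n, n_features))                      # contexts
    C = Z @ B.T + 0.1 * rng.normal(size=(n, n_actions))       # noisy costs
    return Z, C

Z_tr, C_tr = sample(n_train)
Z_te, C_te = sample(n_test)

model = LinearRegression().fit(Z_tr, C_tr)                    # plug-in regression
C_hat = model.predict(Z_te)

decisions = C_hat.argmin(axis=1)                              # optimize predicted costs
plug_in_cost = C_te[np.arange(n_test), decisions].mean()
oracle_cost = C_te.min(axis=1).mean()                          # decisions under true costs
print(f"plug-in cost {plug_in_cost:.3f}  vs. oracle {oracle_cost:.3f} "
      f"(regret {plug_in_cost - oracle_cost:.3f})")
```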
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of automatic primary response (APR) within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
- Excursion Search for Constrained Bayesian Optimization under a Limited Budget of Failures [62.41541049302712]
We propose a novel decision maker grounded in control theory that controls the amount of risk we allow in the search as a function of a given budget of failures.
Our algorithm uses the failure budget more efficiently than state-of-the-art methods in a variety of optimization experiments, and generally achieves lower regret.
arXiv Detail & Related papers (2020-05-15T09:54:09Z)
- Time-varying Gaussian Process Bandit Optimization with Non-constant Evaluation Time [93.6788993843846]
We propose a novel time-varying Bayesian optimization algorithm that can effectively handle the non-constant evaluation time.
Our bound shows that the pattern of the evaluation time sequence can strongly affect the difficulty of the problem.
arXiv Detail & Related papers (2020-03-10T13:28:33Z)
- Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform optimizers into operations that are differentiable and never locally constant.
Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers.
We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
arXiv Detail & Related papers (2020-02-20T11:11:32Z)
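A small sketch of the perturbed-optimizer idea for a toy argmax-over-coordinates problem: averaging argmax solutions under Gaussian perturbations yields a smoothed, differentiable output, and a Monte Carlo Jacobian follows from a Stein-type identity. The noise scale and sample count are illustrative choices, not the paper's recommendations.

```python
# Hedged sketch of a perturbed optimizer: the piecewise-constant argmax map is
# smoothed by averaging argmax solutions under Gaussian perturbations of its
# input, and a Monte Carlo Jacobian follows from Stein's identity.
import numpy as np

rng = np.random.default_rng(4)

def argmax_one_hot(theta):
    """Discrete optimizer: the vertex (one-hot vector) maximizing theta @ y."""
    y = np.zeros_like(theta)
    y[np.argmax(theta)] = 1.0
    return y

def perturbed_argmax(theta, eps=0.5, n_samples=2000):
    """Monte Carlo estimate of the perturbed (smoothed) argmax and its Jacobian."""
    d = theta.shape[0]
    Z = rng.normal(size=(n_samples, d))
    Y = np.array([argmax_one_hot(theta + eps * z) for z in Z])
    y_smooth = Y.mean(axis=0)                                  # smoothed output
    jac = (Y[:, :, None] * Z[:, None, :]).mean(axis=0) / eps   # Stein-type estimator
    return y_smooth, jac

theta = np.array([0.2, 0.1, 0.15])
y_smooth, jac = perturbed_argmax(theta)
print("smoothed argmax:", np.round(y_smooth, 3))    # close to, but not exactly, one-hot
print("Jacobian estimate:\n", np.round(jac, 3))     # nonzero, unlike the exact argmax
```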
This list is automatically generated from the titles and abstracts of the papers on this site.