Adaptive Real Time Exploration and Optimization for Safety-Critical
Systems
- URL: http://arxiv.org/abs/2211.05495v1
- Date: Thu, 10 Nov 2022 11:37:22 GMT
- Title: Adaptive Real Time Exploration and Optimization for Safety-Critical
Systems
- Authors: Buse Sibel Korkmaz (1), Mehmet Mercangöz (1), Marta Zagórowska (2)
((1) Imperial College London, (2) ETH Zürich)
- Abstract summary: We propose the ARTEO algorithm, where we cast multi-armed bandits as a mathematical programming problem subject to safety constraints.
We learn the environmental characteristics through changes in optimization inputs and through exploration.
Compared to existing safe-learning approaches, our algorithm does not require an exclusive exploration phase and pursues the optimization goals even at explored points.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We consider the problem of decision-making under uncertainty in an
environment with safety constraints. Many business and industrial applications
rely on real-time optimization with changing inputs to improve key performance
indicators. In the case of unknown environmental characteristics, real-time
optimization becomes challenging, particularly for the satisfaction of safety
constraints. We propose the ARTEO algorithm, where we cast multi-armed bandits
as a mathematical programming problem subject to safety constraints and learn
the environmental characteristics through changes in optimization inputs and
through exploration. We quantify the uncertainty in unknown characteristics by
using Gaussian processes and incorporate it into the utility function as a
contribution which drives exploration. We adaptively control the size of this
contribution using a heuristic in accordance with the requirements of the
environment. We guarantee the safety of our algorithm with a high probability
through confidence bounds constructed under the regularity assumptions of
Gaussian processes. Compared to existing safe-learning approaches, our
algorithm does not require an exclusive exploration phase and pursues the
optimization goals even at explored points, which makes it suitable for
safety-critical systems. We demonstrate the safety and efficiency of our
approach with two experiments: an industrial process and an online bid
optimization benchmark problem.
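The mechanism described in the abstract (a Gaussian process posterior that supplies both an exploration bonus in the utility function and a pessimistic confidence bound on the safety constraint) can be sketched roughly as follows. This is a minimal illustration, not the paper's actual implementation: the RBF kernel, the unknown functions `f` and `g`, the safety threshold `0.5`, the confidence width `beta`, and the exploration weight `z` are all illustrative assumptions.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    # squared-exponential kernel between two sets of points
    d = A[:, None, :] - B[None, :, :]
    return np.exp(-0.5 * np.sum(d ** 2, axis=-1) / ls ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    # GP posterior mean and standard deviation at query points Xs
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = np.diag(rbf(Xs, Xs) - Ks.T @ Kinv @ Ks)
    return mu, np.sqrt(np.maximum(var, 0.0))

# Hypothetical unknown reward f and safety function g, constraint g(x) <= 0.5
f = lambda x: np.sin(3 * x[:, 0])
g = lambda x: x[:, 0] ** 2

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(5, 1))      # past optimization inputs
yf, yg = f(X), g(X)                       # observed reward / safety values

candidates = np.linspace(-1, 1, 201).reshape(-1, 1)
mu_f, sd_f = gp_posterior(X, yf, candidates)
mu_g, sd_g = gp_posterior(X, yg, candidates)

beta = 2.0  # confidence-bound width from the GP regularity assumptions
z = 0.5     # exploration weight (adapted by a heuristic in the paper)

# Pessimistic safety check: upper confidence bound must respect the limit
safe = mu_g + beta * sd_g <= 0.5
# Utility = optimization goal plus an uncertainty-driven exploration term
utility = mu_f + z * sd_f
utility[~safe] = -np.inf
x_next = candidates[np.argmax(utility)]   # next input to apply
```

Note that, unlike methods with a dedicated exploration phase, the candidate is always chosen by maximizing the utility (reward plus exploration bonus) over the estimated safe set; in the paper the weight `z` is adjusted online rather than fixed as here.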
Related papers
- Safe Time-Varying Optimization based on Gaussian Processes with Spatio-Temporal Kernel [4.586346034304039]
TVSafeOpt is an algorithm for time-varying optimization problems with unknown reward and safety functions.
It safely tracks a time-varying safe region without the need for explicit change detection.
We show that TVSafeOpt compares favorably against SafeOpt on synthetic data, both regarding safety and optimality.
arXiv Detail & Related papers (2024-09-26T16:09:19Z)
- Information-Theoretic Safe Bayesian Optimization [59.758009422067005]
We consider a sequential decision making task, where the goal is to optimize an unknown function without evaluating parameters that violate an unknown (safety) constraint.
Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case.
We propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate.
arXiv Detail & Related papers (2024-02-23T14:31:10Z)
- SCPO: Safe Reinforcement Learning with Safety Critic Policy Optimization [1.3597551064547502]
This study introduces Safety Critic Policy Optimization, a novel safe reinforcement learning algorithm.
It defines the safety critic, a mechanism that nullifies rewards obtained by violating safety constraints.
Our theoretical analysis indicates that the proposed algorithm can automatically balance the trade-off between adhering to safety constraints and maximizing rewards.
arXiv Detail & Related papers (2023-11-01T22:12:50Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Meta-Learning Priors for Safe Bayesian Optimization [72.8349503901712]
We build on a meta-learning algorithm, F-PACOH, capable of providing reliable uncertainty quantification in settings of data scarcity.
As a core contribution, we develop a novel framework for choosing safety-compliant priors in a data-driven manner.
On benchmark functions and a high-precision motion system, we demonstrate that our meta-learned priors accelerate the convergence of safe BO approaches.
arXiv Detail & Related papers (2022-10-03T08:38:38Z)
- Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing constraint violations in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
- Are Evolutionary Algorithms Safe Optimizers? [3.3044468943230427]
This paper aims to reignite interest in safe optimization problems (SafeOPs) in the evolutionary computation (EC) community.
We provide a formal definition of SafeOPs, investigate the impact of key SafeOP parameters on the performance of selected safe optimization algorithms, and benchmark EC against state-of-the-art safe optimization algorithms.
arXiv Detail & Related papers (2022-03-24T17:11:36Z)
- Safe Online Bid Optimization with Return-On-Investment and Budget Constraints subject to Uncertainty [87.81197574939355]
We study the nature of both the optimization and learning problems.
We provide an algorithm, namely GCB, guaranteeing sublinear regret at the cost of a potentially linear number of constraints violations.
More interestingly, we provide an algorithm, namely GCB_safe(psi,phi), guaranteeing both sublinear pseudo-regret and safety with high probability at the cost of accepting tolerances psi and phi.
arXiv Detail & Related papers (2022-01-18T17:24:20Z)
- Safe Policy Optimization with Local Generalized Linear Function Approximations [17.84511819022308]
Existing safe exploration methods guarantee safety under regularity assumptions.
We propose a novel algorithm, SPO-LF, that optimizes an agent's policy while learning the relation between locally available sensor features and the environmental reward/safety.
We experimentally show that our algorithm is 1) more efficient in terms of sample complexity and computational cost and 2) more applicable to large-scale problems than previous safe RL methods with theoretical guarantees.
arXiv Detail & Related papers (2021-11-09T00:47:50Z)
- Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.