Safe Learning under Uncertain Objectives and Constraints
- URL: http://arxiv.org/abs/2006.13326v1
- Date: Tue, 23 Jun 2020 20:51:00 GMT
- Title: Safe Learning under Uncertain Objectives and Constraints
- Authors: Mohammad Fereydounian, Zebang Shen, Aryan Mokhtari, Amin Karbasi,
Hamed Hassani
- Abstract summary: We consider non-textitunknown yet safety-critical optimization problems under textitunknown yet safety-critical constraints.
Such problems naturally arise in a variety of domains including robotics, manufacturing, and medical procedures.
A crucial component of our analysis is to introduce and apply a technique called shrinkage in the context of safe optimization.
- Score: 66.05180398174286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we consider non-convex optimization problems under
\textit{unknown} yet safety-critical constraints. Such problems naturally arise
in a variety of domains including robotics, manufacturing, and medical
procedures, where it is infeasible to know or identify all the constraints.
Therefore, the parameter space should be explored in a conservative way to
ensure that none of the constraints are violated during the optimization
process once we start from a safe initialization point. To this end, we develop
an algorithm called Reliable Frank-Wolfe (Reliable-FW). Given a general
non-convex function and an unknown polytope constraint, Reliable-FW
simultaneously learns the landscape of the objective function and the boundary
of the safety polytope. More precisely, by assuming that Reliable-FW has access
to a (stochastic) gradient oracle of the objective function and a noisy
feasibility oracle of the safety polytope, it finds an $\epsilon$-approximate
first-order stationary point with the optimal ${\mathcal{O}}({1}/{\epsilon^2})$
gradient oracle complexity (resp. $\tilde{\mathcal{O}}({1}/{\epsilon^3})$ (also
optimal) in the stochastic gradient setting), while ensuring the safety of all
the iterates. Rather surprisingly, Reliable-FW only makes
$\tilde{\mathcal{O}}(({d^2}/{\epsilon^2})\log 1/\delta)$ queries to the noisy
feasibility oracle (resp. $\tilde{\mathcal{O}}(({d^2}/{\epsilon^4})\log
1/\delta)$ in the stochastic gradient setting) where $d$ is the dimension and
$\delta$ is the reliability parameter, tightening the existing bounds even for
safe minimization of convex functions. We further specialize our results to the
case that the objective function is convex. A crucial component of our analysis
is to introduce and apply a technique called geometric shrinkage in the context
of safe optimization.
Related papers
- Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization [77.3396841985172]
We provide a unified analysis of two-timescale gradient ascent (TTGDA) for solving structured non minimax optimization problems.
Our contribution is to design TTGDA algorithms are effective beyond the setting.
arXiv Detail & Related papers (2024-08-21T20:14:54Z) - No-Regret Algorithms for Safe Bayesian Optimization with Monotonicity Constraints [41.04951588017592]
We consider the problem of sequentially maximizing an unknown function $f$ over a set of actions of the form $(s,mathbfx)$.
We show that a modified version of our algorithm is able to attain sublinear regret for the task of finding a near-optimal $s$ corresponding to every $mathbfx$.
arXiv Detail & Related papers (2024-06-05T13:41:26Z) - Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach [57.92727189589498]
We propose an online convex optimization approach with two different levels of adaptivity.
We obtain $mathcalO(log V_T)$, $mathcalO(d log V_T)$ and $hatmathcalO(sqrtV_T)$ regret bounds for strongly convex, exp-concave and convex loss functions.
arXiv Detail & Related papers (2023-07-17T09:55:35Z) - ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
We introduce a new tool for BFG convex optimization (SCO): a Reweighted Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density.
We develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings.
arXiv Detail & Related papers (2023-01-01T18:51:29Z) - Towards Painless Policy Optimization for Constrained MDPs [46.12526917024248]
We study policy optimization in an infinite horizon, $gamma$-discounted constrained Markov decision process (CMDP)
Our objective is to return a policy that achieves large expected reward with a small constraint violation.
We propose a generic primal-dual framework that allows us to bound the reward sub-optimality and constraint violation for arbitrary algorithms.
arXiv Detail & Related papers (2022-04-11T15:08:09Z) - Projection Efficient Subgradient Method and Optimal Nonsmooth
Frank-Wolfe Method [54.93433440034386]
We find a feasible $epsilon$-suboptimal solution using only $O(epsilon-1)$ PO calls and optimal $O(epsilon-2)$ FO calls.
Our experiments confirm that these methods achieve significant speedups over the state-of-the-art, for a problem with costly PO and LMO calls.
arXiv Detail & Related papers (2020-10-05T08:16:56Z) - Conservative Stochastic Optimization with Expectation Constraints [11.393603788068777]
This paper considers convex optimization problems where the objective and constraint functions involve expectations with respect to the data indices or environmental variables.
Online and efficient approaches for solving such problems have not been widely studied.
We propose a novel conservative optimization algorithm (CSOA) that achieves zero constraint violation and $Oleft(T-frac12right)$ optimality gap.
arXiv Detail & Related papers (2020-08-13T08:56:24Z) - Provably Efficient Safe Exploration via Primal-Dual Policy Optimization [105.7510838453122]
We study the Safe Reinforcement Learning (SRL) problem using the Constrained Markov Decision Process (CMDP) formulation.
We present an provably efficient online policy optimization algorithm for CMDP with safe exploration in the function approximation setting.
arXiv Detail & Related papers (2020-03-01T17:47:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.