Instance-Dependent Bounds for Zeroth-order Lipschitz Optimization with
Error Certificates
- URL: http://arxiv.org/abs/2102.01977v5
- Date: Wed, 22 Mar 2023 09:17:12 GMT
- Title: Instance-Dependent Bounds for Zeroth-order Lipschitz Optimization with
Error Certificates
- Authors: François Bachoc (IMT, GdR MASCOT-NUM), Tommaso R. Cesari (TSE-R),
Sébastien Gerchinovitz (IMT)
- Abstract summary: We study the problem of zeroth-order (black-box) optimization of a Lipschitz function $f$ defined on a compact subset $\mathcal X$ of $\mathbb R^d$.
We characterize the optimal number of evaluations of any Lipschitz function $f$ needed to find and certify an approximate maximizer of $f$ at accuracy $\varepsilon$.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of zeroth-order (black-box) optimization of a Lipschitz
function $f$ defined on a compact subset $\mathcal X$ of $\mathbb R^d$, with
the additional constraint that algorithms must certify the accuracy of their
recommendations. We characterize the optimal number of evaluations of any
Lipschitz function $f$ to find and certify an approximate maximizer of $f$ at
accuracy $\varepsilon$. Under a weak assumption on $\mathcal X$, this optimal
sample complexity is shown to be nearly proportional to the integral
$\int_{\mathcal X} \mathrm{d}\boldsymbol x/( \max(f) - f(\boldsymbol x) +
\varepsilon )^d$. This result, which was only (and partially) known in
dimension $d=1$, solves an open problem dating back to 1991. In terms of
techniques, our upper bound relies on a packing bound by Bouttier et al. (2020)
for the Piyavskii-Shubert algorithm that we link to the above integral. We also
show that a certified version of the computationally tractable DOO algorithm
matches these packing and integral bounds. Our instance-dependent lower bound
differs from traditional worst-case lower bounds in the Lipschitz setting and
relies on a local worst-case analysis that could likely prove useful for other
learning tasks.
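The Piyavskii-Shubert algorithm mentioned above maintains a Lipschitz upper envelope of $f$ built from past evaluations, which is exactly what makes an error certificate possible: the gap between the envelope's peak and the best observed value provably bounds the suboptimality. Below is a minimal one-dimensional sketch of this idea; all names are illustrative, and the paper's certified variant and its analysis differ in details.

```python
def piyavskii_shubert(f, a, b, lipschitz, eps, max_evals=10_000):
    """Maximize an L-Lipschitz f on [a, b].

    Returns (x_best, f_best, certified_gap), where certified_gap is a
    provable upper bound on max(f) - f_best.
    """
    xs = [a, b]
    ys = [f(a), f(b)]
    for _ in range(max_evals):
        f_best = max(ys)
        # Upper envelope: on each interval [x0, x1] the Lipschitz bound
        # min(y0 + L*(x - x0), y1 + L*(x1 - x)) peaks where the two
        # cones intersect.
        best_peak, best_x = -float("inf"), None
        pts = sorted(zip(xs, ys))
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            x_peak = 0.5 * (x0 + x1) + (y1 - y0) / (2 * lipschitz)
            peak = 0.5 * (y0 + y1) + 0.5 * lipschitz * (x1 - x0)
            if peak > best_peak:
                best_peak, best_x = peak, x_peak
        # Certificate: the envelope upper-bounds max(f) everywhere,
        # so best_peak - f_best bounds the true optimization error.
        if best_peak - f_best <= eps:
            return max(zip(ys, xs))[1], f_best, best_peak - f_best
        # Otherwise, evaluate f at the envelope's highest point.
        xs.append(best_x)
        ys.append(f(best_x))
    return max(zip(ys, xs))[1], max(ys), best_peak - max(ys)
```

Note that the stopping rule queries exactly where the envelope is highest, which is what ties the number of evaluations to the integral $\int_{\mathcal X} \mathrm{d}\boldsymbol x/(\max(f) - f(\boldsymbol x) + \varepsilon)^d$ in the bound above.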
Related papers
- Robust Distribution Learning with Local and Global Adversarial Corruptions [17.22168727622332]
We develop an efficient finite-sample algorithm with error bounded by $\sqrt{\varepsilon k} + \rho + \tilde{O}(d\sqrt{k}\,n^{-1/(k \lor 2)})$ when $P$ has bounded covariance.
Our efficient procedure relies on a novel trace norm approximation of an ideal yet intractable 2-Wasserstein projection estimator.
arXiv Detail & Related papers (2024-06-10T17:48:36Z) - Certified Multi-Fidelity Zeroth-Order Optimization [4.450536872346658]
We consider the problem of multi-fidelity zeroth-order optimization, where one can evaluate a function $f$ at various approximation levels.
We propose a certified variant of the MFDOO algorithm and derive a bound on its cost complexity for any Lipschitz function $f$.
We also prove an $f$-dependent lower bound showing that this algorithm has a near-optimal cost complexity.
arXiv Detail & Related papers (2023-08-02T07:20:37Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Tight Bounds for $\gamma$-Regret via the Decision-Estimation Coefficient [88.86699022151598]
We give a statistical characterization of the $\gamma$-regret for arbitrary structured bandit problems.
The $\gamma$-regret emerges in structured bandit problems over a function class $\mathcal{F}$ where finding an exact optimum of $f \in \mathcal{F}$ is intractable.
arXiv Detail & Related papers (2023-03-06T17:54:33Z) - Private Isotonic Regression [54.32252900997422]
We consider the problem of isotonic regression over a partially ordered set (poset) $\mathcal{X}$ and for any Lipschitz loss function.
We obtain a pure-DP algorithm that has an expected excess empirical risk of roughly $\mathrm{width}(\mathcal{X}) \cdot \log|\mathcal{X}| / n$, where $\mathrm{width}(\mathcal{X})$ is the width of the poset.
We show that the bounds above are essentially the best that can be achieved.
arXiv Detail & Related papers (2022-10-27T05:08:07Z) - An Optimal Stochastic Algorithm for Decentralized Nonconvex Finite-sum
Optimization [25.21457349137344]
We prove that DEAREST requires at most $\mathcal{O}(mn + \sqrt{mn}\,L\varepsilon^{-2})$ incremental first-order oracle (IFO) calls and $\mathcal{O}(L\varepsilon^{-2}/\sqrt{1-\lambda_W})$ communication rounds.
arXiv Detail & Related papers (2022-10-25T11:37:11Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm sbcalg that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - Near-Optimal Regret Bounds for Contextual Combinatorial Semi-Bandits
with Linear Payoff Functions [53.77572276969548]
We show that the C$^2$UCB algorithm has the optimal regret bound $\tilde{O}(d\sqrt{kT} + dk)$ for the partition matroid constraints.
For general constraints, we propose an algorithm that modifies the reward estimates of arms in the C$^2$UCB algorithm.
arXiv Detail & Related papers (2021-01-20T04:29:18Z) - Greedy Adversarial Equilibrium: An Efficient Alternative to
Nonconvex-Nonconcave Min-Max Optimization [28.431572772564518]
We show that an $\varepsilon$-greedy adversarial model converges from any starting point to a local approximation of $\max_z f(x, z)$.
Its runtime is polynomial in the dimension $d$, in $1/\varepsilon$, and in the Lipschitz bounds on $\nabla_y f(x,y)$ and $\nabla^2_y f(x,y)$.
arXiv Detail & Related papers (2020-06-22T16:03:41Z) - Maximizing Determinants under Matroid Constraints [69.25768526213689]
We study the problem of finding a basis $S$ of $M$ such that $\det(\sum_{i \in S} v_i v_i^\top)$ is maximized.
This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning.
arXiv Detail & Related papers (2020-04-16T19:16:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.