Sample Complexity for Quadratic Bandits: Hessian Dependent Bounds and
Optimal Algorithms
- URL: http://arxiv.org/abs/2306.12383v3
- Date: Mon, 25 Dec 2023 06:41:11 GMT
- Title: Sample Complexity for Quadratic Bandits: Hessian Dependent Bounds and
Optimal Algorithms
- Authors: Qian Yu, Yining Wang, Baihe Huang, Qi Lei, Jason D. Lee
- Abstract summary: We show the first tight characterization of the optimal Hessian-dependent sample complexity.
A Hessian-independent algorithm universally achieves the optimal sample complexities for all Hessian instances.
The optimal sample complexities achieved by our algorithm remain valid for heavy-tailed noise distributions.
- Score: 64.10576998630981
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In stochastic zeroth-order optimization, a problem of practical relevance is
understanding how to fully exploit the local geometry of the underlying
objective function. We consider a fundamental setting in which the objective
function is quadratic, and provide the first tight characterization of the
optimal Hessian-dependent sample complexity. Our contribution is twofold.
First, from an information-theoretic point of view, we prove tight lower bounds
on Hessian-dependent complexities by introducing a concept called energy
allocation, which captures the interaction between the searching algorithm and
the geometry of objective functions. A matching upper bound is obtained by
solving the optimal energy spectrum. Then, algorithmically, we show the
existence of a Hessian-independent algorithm that universally achieves the
asymptotically optimal sample complexities for all Hessian instances. The optimal
sample complexities achieved by our algorithm remain valid for heavy-tailed
noise distributions; this robustness is enabled by a truncation method.
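As a minimal illustrative sketch (not the authors' algorithm), the NumPy snippet below performs zeroth-order descent on an assumed quadratic objective using a standard two-point gradient estimator, and truncates (clips) each noisy function evaluation to cope with heavy-tailed noise in the spirit of the truncation idea above. The Hessian `H`, the Student-t noise model, the truncation level `tau`, the smoothing radius `delta`, and the step size are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative quadratic f(x) = 0.5 * x^T H x with an assumed Hessian H.
H = np.array([[2.0, 0.3], [0.3, 0.5]])

def noisy_eval(x):
    """Zeroth-order oracle: f(x) plus heavy-tailed (Student-t) noise."""
    return 0.5 * x @ H @ x + rng.standard_t(df=2.5)

def truncate(y, tau):
    """Truncation step: clip a noisy evaluation to [-tau, tau]."""
    return np.clip(y, -tau, tau)

def two_point_gradient(x, delta=1e-1, tau=10.0):
    """Two-point zeroth-order gradient estimate from truncated evaluations."""
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)
    y_plus = truncate(noisy_eval(x + delta * u), tau)
    y_minus = truncate(noisy_eval(x - delta * u), tau)
    return x.size * (y_plus - y_minus) / (2.0 * delta) * u

# Simple zeroth-order descent loop on the assumed quadratic.
x = np.array([3.0, -2.0])
for t in range(2000):
    g = two_point_gradient(x)
    x -= 0.01 * g

print("final iterate:", x)  # should drift toward the minimizer at the origin
```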
Related papers
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm only has access to noisy evaluations of the objective function at the points it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Proximal Oracles for Optimization and Sampling [18.77973093341588]
We consider convex optimization with non-smooth objective function and log-concave sampling with non-smooth potential.
To overcome the challenges caused by non-smoothness, our algorithms employ two powerful proximal frameworks in optimization and sampling.
arXiv Detail & Related papers (2024-04-02T18:52:28Z) - Accelerated Cyclic Coordinate Dual Averaging with Extrapolation for
Composite Convex Optimization [20.11028799145883]
We propose an Accelerated Cyclic Coordinate Dual Averaging with Extrapolation (A-CODER) method for composite convex optimization.
We show that A-CODER attains the optimal convergence rate with improved dependence on the number of blocks compared to prior work.
For the setting where the smooth component of the objective function is expressible in a finite sum form, we introduce a variance-reduced variant of A-CODER, VR-A-CODER, with state-of-the-art complexity guarantees.
arXiv Detail & Related papers (2023-03-28T19:46:30Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - An Algebraically Converging Stochastic Gradient Descent Algorithm for
Global Optimization [14.336473214524663]
A key component in the algorithm is the randomness based on the value of the objective function.
We prove the convergence of the algorithm with an algebraic rate in the parameter space.
We present several numerical examples to demonstrate the efficiency and robustness of the algorithm.
arXiv Detail & Related papers (2022-04-12T16:27:49Z) - A Granular Sieving Algorithm for Deterministic Global Optimization [6.01919376499018]
A gradient-free deterministic method is developed to solve global optimization problems for Lipschitz continuous functions.
The method can be regarded as granular sieving with synchronous analysis in both the domain and range of the objective function.
arXiv Detail & Related papers (2021-07-14T10:03:03Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide a small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds that depend on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping; a minimal clipping sketch appears after this list.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Accelerated Algorithms for Convex and Non-Convex Optimization on
Manifolds [9.632674803757475]
We propose a scheme for solving convex and non-convex optimization problems on manifolds.
Our proposed algorithm adapts to the level of complexity in the objective function.
arXiv Detail & Related papers (2020-10-18T02:48:22Z) - Sequential Subspace Search for Functional Bayesian Optimization
Incorporating Experimenter Intuition [63.011641517977644]
Our algorithm generates a sequence of finite-dimensional random subspaces of functional space spanned by a set of draws from the experimenter's Gaussian Process.
Standard Bayesian optimisation is applied on each subspace, and the best solution found is used as a starting point (origin) for the next subspace.
We test our algorithm in simulated and real-world experiments, namely blind function matching, finding the optimal precipitation-strengthening function for an aluminium alloy, and learning rate schedule optimisation for deep networks.
arXiv Detail & Related papers (2020-09-08T06:54:11Z) - Exploiting Higher Order Smoothness in Derivative-free Optimization and
Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z)
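As a rough companion to the gradient-clipping entry above (a minimal sketch, not that paper's stepsize rules), the snippet below applies norm clipping to a stochastic gradient before each SGD step so that heavy-tailed noise cannot produce arbitrarily large updates. The objective, the Student-t noise model, the clipping level `lam`, and the O(1/t) stepsize are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def clip_by_norm(g, lam):
    """Norm clipping: rescale g so that ||g|| <= lam."""
    norm = np.linalg.norm(g)
    return g if norm <= lam else g * (lam / norm)

def noisy_gradient(x):
    """Stochastic gradient of 0.5*||x||^2 with heavy-tailed (Student-t) noise."""
    return x + rng.standard_t(df=2.5, size=x.shape)

x = np.array([5.0, -4.0, 3.0])
for t in range(1, 5001):
    g = clip_by_norm(noisy_gradient(x), lam=2.0)
    x -= (0.5 / t) * g   # assumed decaying stepsize, purely illustrative

print("final iterate:", x)  # should approach the minimizer at the origin
```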