Adaptive Sampling Quasi-Newton Methods for Zeroth-Order Stochastic
Optimization
- URL: http://arxiv.org/abs/2109.12213v1
- Date: Fri, 24 Sep 2021 21:49:25 GMT
- Title: Adaptive Sampling Quasi-Newton Methods for Zeroth-Order Stochastic
Optimization
- Authors: Raghu Bollapragada and Stefan M. Wild
- Abstract summary: We consider unconstrained optimization problems with no available gradient information.
We propose an adaptive sampling quasi-Newton method where we estimate the gradients of a simulation function using finite differences within a common random number framework.
We develop modified versions of a norm test and an inner product quasi-Newton test to control the sample sizes used in the approximations and provide global convergence results to the neighborhood of the optimal solution.
- Score: 1.7513645771137178
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider unconstrained stochastic optimization problems with no available
gradient information. Such problems arise in settings from derivative-free
simulation optimization to reinforcement learning. We propose an adaptive
sampling quasi-Newton method where we estimate the gradients of a stochastic
function using finite differences within a common random number framework. We
develop modified versions of a norm test and an inner product quasi-Newton test
to control the sample sizes used in the stochastic approximations and provide
global convergence results to the neighborhood of the optimal solution. We
present numerical experiments on simulation optimization problems to illustrate
the performance of the proposed algorithm. When compared with classical
zeroth-order stochastic gradient methods, we observe that our strategies of
adapting the sample sizes significantly improve performance in terms of the
number of stochastic function evaluations required.
Related papers
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm is only accessible to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization [0.0]
We propose a novel algorithm that extends the methods of ball smoothing and Gaussian smoothing for noisy derivative-free optimization.
The algorithm dynamically adapts the shape of the smoothing kernel to approximate the Hessian of the objective function around a local optimum.
arXiv Detail & Related papers (2024-05-02T21:04:20Z) - On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization [10.36447258513813]
We consider a regularized expected reward optimization problem in the non-oblivious setting that covers many existing problems in reinforcement learning (RL)
In particular, the method has shown to admit an $O(epsilon-4)$ sample to an $epsilon$-stationary point, under standard conditions.
Our analysis shows that the sample complexity can be improved from $O(epsilon-4)$ to $O(epsilon-3)$ under additional conditions.
arXiv Detail & Related papers (2024-01-23T06:01:29Z) - Backward error analysis and the qualitative behaviour of stochastic
optimization algorithms: Application to stochastic coordinate descent [1.534667887016089]
We propose a class of differential equations that approximate the dynamics of general optimization methods more closely than the original gradient flow.
We study the stability of the modified equation in the case of coordinate descent.
arXiv Detail & Related papers (2023-09-05T09:39:56Z) - Adaptive Zeroth-Order Optimisation of Nonconvex Composite Objectives [1.7640556247739623]
We analyze algorithms for zeroth-order entropy composite objectives, focusing on dependence on dimensionality.
This is achieved by exploiting low dimensional structure of the decision set using the mirror descent method with an estimation alike function.
To improve the gradient, we replace the classic sampling method based on Rademacher and show that the mini-batch method copes with non-Eucli geometry.
arXiv Detail & Related papers (2022-08-09T07:36:25Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration of $O (1/epsilon4)$ for finding an $epsilon$stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O (1/epsilon4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - On Stochastic Moving-Average Estimators for Non-Convex Optimization [105.22760323075008]
In this paper, we demonstrate the power of a widely used estimator based on moving average (SEMA) problems.
For all these-the-art results, we also present the results for all these-the-art problems.
arXiv Detail & Related papers (2021-04-30T08:50:24Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work is on the iteration of zero-th-order (ZO) optimization which does not require first-order information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity as well as as function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Sequential Subspace Search for Functional Bayesian Optimization
Incorporating Experimenter Intuition [63.011641517977644]
Our algorithm generates a sequence of finite-dimensional random subspaces of functional space spanned by a set of draws from the experimenter's Gaussian Process.
Standard Bayesian optimisation is applied on each subspace, and the best solution found used as a starting point (origin) for the next subspace.
We test our algorithm in simulated and real-world experiments, namely blind function matching, finding the optimal precipitation-strengthening function for an aluminium alloy, and learning rate schedule optimisation for deep networks.
arXiv Detail & Related papers (2020-09-08T06:54:11Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Adaptive First-and Zeroth-order Methods for Weakly Convex Stochastic
Optimization Problems [12.010310883787911]
We analyze a new family of adaptive subgradient methods for solving an important class of weakly convex (possibly nonsmooth) optimization problems.
Experimental results indicate how the proposed algorithms empirically outperform its zerothorder gradient descent and its design variant.
arXiv Detail & Related papers (2020-05-19T07:44:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.