A Novel Evolution Strategy with Directional Gaussian Smoothing for
Blackbox Optimization
- URL: http://arxiv.org/abs/2002.03001v2
- Date: Thu, 11 Jun 2020 19:14:44 GMT
- Title: A Novel Evolution Strategy with Directional Gaussian Smoothing for
Blackbox Optimization
- Authors: Jiaxin Zhang, Hoang Tran, Dan Lu, Guannan Zhang
- Abstract summary: We propose an improved evolution strategy (ES) using a novel nonlocal gradient operator for high-dimensional black-box optimization.
Standard ES methods with $d$-dimensional Gaussian smoothing suffer from the curse of dimensionality due to the high variance of Monte Carlo based gradient estimators.
- Score: 4.060323179287396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an improved evolution strategy (ES) using a novel nonlocal
gradient operator for high-dimensional black-box optimization. Standard ES
methods with $d$-dimensional Gaussian smoothing suffer from the curse of
dimensionality due to the high variance of Monte Carlo (MC) based gradient
estimators. To control the variance, Gaussian smoothing is usually limited in a
small region, so existing ES methods lack nonlocal exploration ability required
for escaping from local minima. We develop a nonlocal gradient operator with
directional Gaussian smoothing (DGS) to address this challenge. The DGS
conducts 1D nonlocal explorations along $d$ orthogonal directions in
$\mathbb{R}^d$, each of which defines a nonlocal directional derivative as a 1D
integral. We then use Gauss-Hermite quadrature, instead of MC sampling, to
estimate the $d$ 1D integrals to ensure high accuracy (i.e., small variance).
Our method enables effective nonlocal exploration to facilitate the global
search in high-dimensional optimization. We demonstrate the superior
performance of our method in three sets of examples, including benchmark
functions for global optimization, and real-world science and engineering
applications.
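The DGS construction lends itself to a compact implementation: each of the $d$ directional derivatives is the derivative of a 1D Gaussian-smoothed slice of the objective, which Gauss-Hermite quadrature approximates with a handful of function evaluations per direction. Below is a minimal Python sketch of such an estimator, assuming a fixed orthonormal basis, a smoothing radius `sigma`, and an `m`-point quadrature rule; the names and defaults are illustrative, not the authors' reference implementation.
```python
# Minimal sketch of a DGS-type gradient estimator (assumptions: smoothing
# radius `sigma`, orthonormal `basis`, `m`-point Gauss-Hermite rule).
import numpy as np

def dgs_gradient(f, x, sigma=0.5, basis=None, m=7):
    """Estimate a nonlocal DGS-style gradient of f at x.

    For each of the d orthogonal directions, the derivative of the 1D
    Gaussian-smoothed slice of f is approximated with Gauss-Hermite
    quadrature instead of Monte Carlo sampling.
    """
    d = x.size
    if basis is None:
        basis = np.eye(d)                      # columns are the d directions
    nodes, weights = np.polynomial.hermite.hermgauss(m)

    dir_derivs = np.zeros(d)
    for i in range(d):
        xi = basis[:, i]
        # f sampled along the i-th direction at the scaled quadrature nodes
        vals = np.array([f(x + np.sqrt(2.0) * sigma * t * xi) for t in nodes])
        # Gauss-Hermite estimate of d/dv E_{u~N(0,1)}[f(x + (v + sigma*u)*xi)] at v = 0
        dir_derivs[i] = np.sqrt(2.0) / (sigma * np.sqrt(np.pi)) * np.sum(weights * nodes * vals)
    # Assemble the d directional derivatives back into a gradient in R^d
    return basis @ dir_derivs

if __name__ == "__main__":
    # Toy check: for a quadratic, the smoothed gradient matches the true gradient.
    f = lambda z: float(np.sum(z ** 2))
    x0 = np.array([1.0, -2.0, 0.5])
    print(dgs_gradient(f, x0))                 # approximately [2, -4, 1]
```
A full ES loop would take a descent step along this estimate and adapt the smoothing radius and the directions over iterations; those pieces are omitted here.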
Related papers
- Neural Gradient Learning and Optimization for Oriented Point Normal
Estimation [53.611206368815125]
We propose a deep learning approach to learn gradient vectors with consistent orientation from 3D point clouds for normal estimation.
We learn an angular distance field based on local plane geometry to refine the coarse gradient vectors.
Our method efficiently conducts global gradient approximation while achieving better accuracy and generalization ability of local feature description.
arXiv Detail & Related papers (2023-09-17T08:35:11Z) - ELRA: Exponential learning rate adaption gradient descent optimization
method [83.88591755871734]
We present a novel, fast (exponential-rate), ab initio (hyperparameter-free) gradient-based adaptation method.
The main idea of the method is to adapt the learning rate $\alpha$ by situational awareness.
It can be applied to problems of any dimension $n$ and scales only linearly with $n$.
arXiv Detail & Related papers (2023-09-12T14:36:13Z) - Orthogonal Directions Constrained Gradient Method: from non-linear
equality constraints to Stiefel manifold [16.099883128428054]
We propose a novel algorithm, the Orthogonal Directions Constrained Gradient Method (ODCGM).
ODCGM only requires computing a projection onto a vector space.
We show that ODCGM exhibits near-optimal oracle complexities.
arXiv Detail & Related papers (2023-03-16T12:25:53Z) - Variance-reduced Clipping for Non-convex Optimization [24.765794811146144]
Gradient clipping is a technique used in deep learning applications such as large-scale language modeling.
Recent experimental evidence suggests that training with gradient clipping exhibits a fairly special behavior that mitigates its order of complexity.
arXiv Detail & Related papers (2023-03-02T00:57:38Z) - Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis [16.499651513178012]
Distributionally robust optimization (DRO) is a widely-used approach to learn models that are robust against distribution shift.
We provide non-asymptotic convergence guarantees for normalized gradient descent even though the objective function is possibly nonsmooth.
arXiv Detail & Related papers (2021-10-24T14:56:38Z) - AdaGDA: Faster Adaptive Gradient Descent Ascent Methods for Minimax
Optimization [104.96004056928474]
We propose a class of faster adaptive gradient descent ascent methods for nonconvex-strongly-concave minimax problems.
We show that our method reaches a lower sample complexity of $O(\kappa^{2.5}\epsilon^{-3})$ with a mini-batch size of $O(\kappa)$.
arXiv Detail & Related papers (2021-06-30T14:47:09Z) - Correcting Momentum with Second-order Information [50.992629498861724]
We develop a new algorithm for non-convex stochastic optimization that finds an $\epsilon$-critical point in the optimal number of stochastic gradient and Hessian-vector product computations.
We validate our results on a variety of large-scale deep learning benchmarks and architectures.
arXiv Detail & Related papers (2021-03-04T19:01:20Z) - AdaDGS: An adaptive black-box optimization method with a nonlocal
directional Gaussian smoothing gradient [3.1546318469750196]
A directional Gaussian smoothing (DGS) approach was recently proposed in (Zhang et al., 2020) and used to define a truly nonlocal gradient, referred to as the DGS gradient, for high-dimensional black-box optimization.
We present a simple, yet ingenious and efficient adaptive approach for optimization with the DGS gradient, which removes the need for hyperparameter fine-tuning.
arXiv Detail & Related papers (2020-11-03T21:20:25Z) - An adaptive stochastic gradient-free approach for high-dimensional
blackbox optimization [0.0]
We propose an adaptive stochastic gradient-free (ASGF) approach for high-dimensional non-smooth optimization problems.
We illustrate the performance of this method on benchmark global optimization problems and learning tasks.
arXiv Detail & Related papers (2020-06-18T22:47:58Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we study gradient-free minimax optimization in the nonconvex strongly-concave setting.
We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for nonconvex-nonconcave minmax problems.
Our experiments show that the advantage of adaptive gradient algorithms over non-adaptive ones in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)