AdaDGS: An adaptive black-box optimization method with a nonlocal
directional Gaussian smoothing gradient
- URL: http://arxiv.org/abs/2011.02009v1
- Date: Tue, 3 Nov 2020 21:20:25 GMT
- Title: AdaDGS: An adaptive black-box optimization method with a nonlocal
directional Gaussian smoothing gradient
- Authors: Hoang Tran and Guannan Zhang
- Abstract summary: A directional Gaussian smoothing (DGS) approach was recently proposed in (Zhang et al., 2020) and used to define a truly nonlocal gradient, referred to as the DGS gradient, for high-dimensional black-box optimization.
We present a simple, yet ingenious and efficient adaptive approach for optimization with the DGS gradient, which removes the need for hyper-parameter fine-tuning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The local gradient points to the direction of the steepest slope in an
infinitesimal neighborhood. An optimizer guided by the local gradient is often
trapped in local optima when the loss landscape is multi-modal. A directional
Gaussian smoothing (DGS) approach was recently proposed in (Zhang et al., 2020)
and used to define a truly nonlocal gradient, referred to as the DGS gradient,
for high-dimensional black-box optimization. Promising results show that
replacing the traditional local gradient with the DGS gradient can
significantly improve the performance of gradient-based methods in optimizing
highly multi-modal loss functions. However, the optimal performance of the DGS
gradient may rely on fine tuning of two important hyper-parameters, i.e., the
smoothing radius and the learning rate. In this paper, we present a simple, yet
ingenious and efficient adaptive approach for optimization with the DGS
gradient, which removes the need for hyper-parameter fine-tuning. Since the DGS
gradient generally points to a good search direction, we perform a line search
along the DGS direction to determine the step size at each iteration. The
learned step size in turn will inform us of the scale of the function landscape in
the surrounding area, based on which we adjust the smoothing radius accordingly
for the next iteration. We present experimental results on high-dimensional
benchmark functions, an airfoil design problem and a game content generation
problem. The AdaDGS method has shown superior performance over several
state-of-the-art black-box optimization methods.
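To make the adaptive scheme concrete, here is a minimal sketch of one AdaDGS-style iteration: a DGS gradient assembled from per-direction Gauss-Hermite quadrature (following the construction in Zhang et al., 2020), a line search along the DGS direction to pick the step size, and a smoothing-radius update driven by the accepted step. The line-search grid, the identity search basis, and the specific radius-update rule are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def dgs_gradient(f, x, sigma, n_quad=5):
    """Nonlocal DGS gradient estimate of f at x.

    Each component is the derivative of a 1D Gaussian smoothing of f along
    one search direction, approximated by Gauss-Hermite quadrature.
    """
    d = x.size
    nodes, weights = np.polynomial.hermite.hermgauss(n_quad)
    basis = np.eye(d)  # orthonormal search directions (identity, for simplicity)
    grad = np.zeros(d)
    for i in range(d):
        # (1/sigma) * E_{u~N(0,1)}[ u * f(x + sigma*u*basis[i]) ] via Gauss-Hermite
        fvals = np.array([f(x + np.sqrt(2.0) * sigma * t * basis[i]) for t in nodes])
        grad[i] = np.sum(weights * np.sqrt(2.0) * nodes * fvals) / (np.sqrt(np.pi) * sigma)
    return grad

def adadgs_step(f, x, sigma, n_search=20):
    """One AdaDGS-style iteration: line search along the DGS direction,
    then let the accepted step size set the next smoothing radius."""
    g = dgs_gradient(f, x, sigma)
    direction = -g / (np.linalg.norm(g) + 1e-12)
    max_step = 10.0 * sigma                        # illustrative search interval
    candidates = np.linspace(max_step / n_search, max_step, n_search)
    losses = [f(x + s * direction) for s in candidates]
    step = candidates[int(np.argmin(losses))]
    sigma_new = max(step, 1e-3)                    # illustrative: radius tracks the learned step
    return x + step * direction, sigma_new

# Usage on a highly multi-modal benchmark (Ackley):
def ackley(x):
    return (-20.0 * np.exp(-0.2 * np.sqrt(np.mean(x**2)))
            - np.exp(np.mean(np.cos(2.0 * np.pi * x))) + 20.0 + np.e)

rng = np.random.default_rng(0)
x, sigma = rng.uniform(-20.0, 20.0, size=10), 5.0
for _ in range(50):
    x, sigma = adadgs_step(ackley, x, sigma)
print(ackley(x))
```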
Related papers
- Neural Gradient Learning and Optimization for Oriented Point Normal
Estimation [53.611206368815125]
We propose a deep learning approach to learn gradient vectors with consistent orientation from 3D point clouds for normal estimation.
We learn an angular distance field based on local plane geometry to refine the coarse gradient vectors.
Our method efficiently conducts global gradient approximation while achieving better accuracy and generalization ability of local feature description.
arXiv Detail & Related papers (2023-09-17T08:35:11Z) - ELRA: Exponential learning rate adaption gradient descent optimization
method [83.88591755871734]
We present a novel, fast (exponential rate), ab initio (hyper-parameter-free) gradient-based adaption method.
The main idea of the method is to adapt the learning rate $\alpha$ by situational awareness.
It can be applied to problems of any dimension $n$ and scales only linearly with $n$.
arXiv Detail & Related papers (2023-09-12T14:36:13Z) - Adaptive Proximal Gradient Method for Convex Optimization [18.681222155879656]
We explore two fundamental first-order algorithms in convex optimization, namely gradient descent (GD) and the proximal gradient method (ProxGD).
Our focus is on making these algorithms entirely adaptive by leveraging local curvature information of smooth functions.
We propose adaptive versions of GD and ProxGD that are based on observed gradient differences and, thus, have no added computational costs.
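As a concrete illustration of step sizes derived from observed gradient differences, the sketch below estimates local curvature from successive gradients in a Malitsky-Mishchenko-style rule; the exact rule and constants in the paper above may differ, and the proximal term is omitted for brevity.

```python
import numpy as np

def adaptive_gd(grad, x0, n_iter=200, lam0=1e-6):
    """Adaptive gradient descent: the step size comes from a local curvature
    estimate built from gradient differences, not a tuned learning rate."""
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    lam_prev, theta_prev = lam0, np.inf
    x = x_prev - lam_prev * g_prev                 # tiny bootstrap step
    for _ in range(n_iter):
        g = grad(x)
        # inverse local Lipschitz estimate: ||x_k - x_{k-1}|| / (2 ||g_k - g_{k-1}||)
        curv = np.linalg.norm(x - x_prev) / (2.0 * np.linalg.norm(g - g_prev) + 1e-16)
        lam = min(np.sqrt(1.0 + theta_prev) * lam_prev, curv)   # capped step growth
        theta_prev, lam_prev = lam / lam_prev, lam
        x_prev, g_prev = x, g
        x = x - lam * g
    return x

# Example: ill-conditioned quadratic f(x) = 0.5 x^T A x with gradient A x.
A = np.diag([1.0, 10.0, 100.0])
print(adaptive_gd(lambda x: A @ x, np.ones(3)))    # approaches the minimizer at 0
```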
arXiv Detail & Related papers (2023-08-04T11:37:08Z) - Gradient Correction beyond Gradient Descent [63.33439072360198]
Gradient correction is apparently the most crucial aspect of training a neural network.
We introduce a framework (GCGD) to perform gradient correction.
Experiment results show that our gradient correction framework can effectively improve the gradient quality to reduce training epochs by $\sim$20% and also improve the network performance.
arXiv Detail & Related papers (2022-03-16T01:42:25Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work is on zeroth-order (ZO) optimization, which does not require first-order information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity and function query cost.
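For the entry above, the sketch below shows the generic ingredient it names: a zeroth-order gradient estimate built from function queries only, with a non-uniform ("importance") distribution over which coordinates get a finite-difference probe. The sampling weights and the estimator are generic illustrations and do not reproduce the paper's hybrid design.

```python
import numpy as np

def zo_gradient(f, x, probs, n_draws=4, mu=1e-3, rng=None):
    """Coordinate-wise ZO gradient estimate with importance sampling.

    Each of the n_draws picks coordinate i with probability probs[i];
    reweighting by 1/(probs[i] * n_draws) keeps the estimate unbiased with
    respect to the full central-difference gradient.
    """
    rng = rng or np.random.default_rng()
    d = x.size
    grad = np.zeros(d)
    idx = rng.choice(d, size=n_draws, replace=True, p=probs)
    for i in idx:
        e = np.zeros(d)
        e[i] = 1.0
        fd = (f(x + mu * e) - f(x - mu * e)) / (2.0 * mu)   # central difference
        grad[i] += fd / (probs[i] * n_draws)                # importance reweighting
    return grad

# Usage: put more sampling mass on coordinates with historically large gradients.
f = lambda x: np.sum(x**2) + 5.0 * x[0]**2
x = np.ones(8)
hist = np.array([10.0, 1, 1, 1, 1, 1, 1, 1])   # e.g. running |gradient| estimates
probs = hist / hist.sum()
print(zo_gradient(f, x, probs))
```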
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Channel-Directed Gradients for Optimization of Convolutional Neural
Networks [50.34913837546743]
We introduce optimization methods for convolutional neural networks that can be used to improve existing gradient-based optimization in terms of generalization error.
We show that defining the gradients along the output channel direction leads to a performance boost, while other directions can be detrimental.
arXiv Detail & Related papers (2020-08-25T00:44:09Z) - An adaptive stochastic gradient-free approach for high-dimensional
blackbox optimization [0.0]
We propose an adaptive stochastic gradient-free (ASGF) approach for high-dimensional non-smooth optimization problems.
We illustrate the performance of this method on benchmark global optimization problems and learning tasks.
arXiv Detail & Related papers (2020-06-18T22:47:58Z) - A Novel Evolution Strategy with Directional Gaussian Smoothing for
Blackbox Optimization [4.060323179287396]
We propose an improved evolution strategy (ES) using a novel nonlocal gradient operator for high-dimensional black-box optimization.
Standard ES methods with $d$-dimensional Gaussian smoothing suffer from the curse of dimensionality due to the high variance of Monte Carlo based gradient estimators.
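For contrast, the snippet below is the standard Monte Carlo (ES) estimator of the gradient of a $d$-dimensional Gaussian smoothing, i.e., the high-variance baseline referred to above; it is not the nonlocal directional operator proposed in the paper (a DGS sketch appears after the abstract above).

```python
import numpy as np

def es_gradient(f, x, sigma=0.1, n_samples=100, rng=None):
    """Antithetic Monte Carlo estimate of grad_x E_{u~N(0,I)}[f(x + sigma*u)]."""
    rng = rng or np.random.default_rng()
    d = x.size
    grad = np.zeros(d)
    for _ in range(n_samples):
        u = rng.standard_normal(d)
        grad += (f(x + sigma * u) - f(x - sigma * u)) * u / (2.0 * sigma)
    return grad / n_samples

# With only O(n_samples) queries in high dimension d, this estimate is noisy;
# the DGS gradient instead spends its queries on per-direction quadrature.
print(es_gradient(lambda x: np.sum(x**2), np.ones(50)))
```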
arXiv Detail & Related papers (2020-02-07T20:17:19Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper, we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for nonconvex-nonconcave min-max problems.
Our experiments show that the advantage of adaptive over non-adaptive gradient algorithms in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)