Explicit Gradient Learning
- URL: http://arxiv.org/abs/2006.08711v1
- Date: Tue, 9 Jun 2020 08:56:24 GMT
- Title: Explicit Gradient Learning
- Authors: Mor Sinay, Elad Sarafian, Yoram Louzoun, Noa Agmon, Sarit Kraus
- Abstract summary: Black-Box Optimization (BBO) methods can find optimal policies for systems with no analytical representation.
EGL trains a NN to estimate the objective gradient directly.
- Score: 28.844181847562695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Black-Box Optimization (BBO) methods can find optimal policies for systems
that interact with complex environments with no analytical representation. As
such, they are of interest in many Artificial Intelligence (AI) domains. Yet
classical BBO methods fall short in high-dimensional non-convex problems. They
are thus often overlooked in real-world AI tasks. Here we present a BBO method,
termed Explicit Gradient Learning (EGL), that is designed to optimize
high-dimensional ill-behaved functions. We derive EGL by finding weak-spots in
methods that fit the objective function with a parametric Neural Network (NN)
model and obtain the gradient signal by calculating the parametric gradient.
Instead of fitting the function, EGL trains a NN to estimate the objective
gradient directly. We prove the convergence of EGL in convex optimization and
its robustness in the optimization of integrable functions. We evaluate EGL and
achieve state-of-the-art results in two challenging problems: (1) the COCO test
suite against an assortment of standard BBO methods; and (2) in a
high-dimensional non-convex image generation task.
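To make the abstract's core idea concrete, the following is a minimal, hypothetical sketch rather than the paper's exact EGL algorithm: a network g_net is trained to output the objective gradient directly by enforcing first-order Taylor consistency on sampled pairs, and the iterate then descends along the learned gradient. The toy objective f, sampling radius eps, network architecture, and all hyperparameters are illustrative assumptions.

```python
# Hypothetical sketch of the idea in the abstract: instead of fitting f(x) and
# differentiating a surrogate model, train a network g_net(x) to predict the
# gradient of the black-box objective directly. The Taylor-consistency loss,
# sampling scheme, and hyperparameters are illustrative assumptions only.
import torch
import torch.nn as nn

def egl_style_step(f, x, g_net, opt, n_samples=64, eps=0.1, lr=1e-2):
    """One outer iteration: fit the gradient network around x, then step x."""
    d = x.numel()
    # Sample perturbed points in a small ball around the current iterate.
    xs = x + eps * torch.randn(n_samples, d)
    with torch.no_grad():
        fs = torch.stack([f(xi) for xi in xs])   # black-box evaluations only
    # Fit g_net so that f(x_j) - f(x_i) ~ g_net(x_i) . (x_j - x_i).
    for _ in range(50):
        i = torch.randint(n_samples, (256,))
        j = torch.randint(n_samples, (256,))
        pred = (g_net(xs[i]) * (xs[j] - xs[i])).sum(dim=1)
        loss = ((fs[j] - fs[i]) - pred).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Descend along the learned gradient estimate.
    with torch.no_grad():
        return x - lr * g_net(x.unsqueeze(0)).squeeze(0)

d = 10
f = lambda v: (v ** 2).sum()                     # toy black-box objective (hypothetical)
g_net = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, d))
opt = torch.optim.Adam(g_net.parameters(), lr=1e-3)
x = torch.randn(d)
for _ in range(100):
    x = egl_style_step(f, x, g_net, opt)
print(float(f(x)))                               # objective should decrease markedly
```

The appeal, as the abstract argues, is that the gradient signal comes from a network trained to predict gradients directly rather than from differentiating a parametric fit of the objective function.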
Related papers
- Optimistic Gradient Learning with Hessian Corrections for High-Dimensional Black-Box Optimization [14.073853819633745]
Black-box algorithms are designed to optimize functions without relying on their underlying analytical structure or gradient information.
We propose two novel gradient learning variants to address the challenges posed by high-dimensional, complex, and highly non-linear problems.
arXiv Detail & Related papers (2025-02-07T11:03:50Z)
- qNBO: quasi-Newton Meets Bilevel Optimization [26.0555315825777]
Bilevel optimization, addressing challenges in hierarchical learning tasks, has gained significant interest in machine learning.
We introduce a general framework to address these computational challenges in a coordinated manner.
Specifically, we leverage quasi-Newton algorithms to accelerate the resolution of the lower-level problem while efficiently approximating the inverse Hessian-vector product.
arXiv Detail & Related papers (2025-02-03T05:36:45Z)
- Convergence of Implicit Gradient Descent for Training Two-Layer Physics-Informed Neural Networks [3.680127959836384]
Implicit gradient descent (IGD) outperforms the common gradient descent (GD) in handling certain multi-scale problems.
We show that IGD converges to a globally optimal solution at a linear convergence rate.
arXiv Detail & Related papers (2024-07-03T06:10:41Z)
- Neural Gradient Learning and Optimization for Oriented Point Normal Estimation [53.611206368815125]
We propose a deep learning approach to learn gradient vectors with consistent orientation from 3D point clouds for normal estimation.
We learn an angular distance field based on local plane geometry to refine the coarse gradient vectors.
Our method efficiently conducts global gradient approximation while achieving better accuracy and generalization ability of local feature description.
arXiv Detail & Related papers (2023-09-17T08:35:11Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
However, PINNs can be trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Global Optimization with Parametric Function Approximation [19.699902334787325]
We consider the problem of global optimization with noisy zeroth-order oracles.
We propose a new algorithm, GO-UCB, that leverages a parametric family of functions instead.
We show that GO-UCB achieves a cumulative regret of $O(\sqrt{T})$, where $T$ is the time horizon.
arXiv Detail & Related papers (2022-11-16T18:35:42Z)
- ES-Based Jacobian Enables Faster Bilevel Optimization [53.675623215542515]
Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems.
Existing gradient-based methods require second-order derivative approximations via Jacobian- and/or Hessian-vector computations.
We propose a novel BO algorithm that adopts an Evolution Strategies (ES)-based method to approximate the response Jacobian matrix in the hypergradient of BO.
arXiv Detail & Related papers (2021-10-13T19:36:50Z)
- Leveraging Non-uniformity in First-order Non-convex Optimization [93.6817946818977]
Non-uniform refinement of objective functions leads to Non-uniform Smoothness (NS) and the Non-uniform Lojasiewicz inequality (NL).
The new definitions inspire new geometry-aware first-order methods that converge to global optimality faster than the classical $\Omega(1/t^2)$ lower bounds.
arXiv Detail & Related papers (2021-05-13T04:23:07Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information (a generic two-point ZO gradient estimator is sketched after this list).
We show that, with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient in terms of both complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad algorithm for nonconvex-nonconcave min-max problems.
Our experiments show that the advantage of adaptive gradient algorithms over non-adaptive ones in training GANs can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
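For context on the zeroth-order entry above, a generic two-point random-direction gradient estimator is sketched below. It illustrates only the general ZO technique of building gradient estimates from function queries, not that paper's hybrid scheme or coordinate importance sampling; the toy objective, smoothing radius mu, and step size are illustrative assumptions.

```python
# Generic two-point zeroth-order gradient estimator (illustrative sketch; not
# the specific hybrid/importance-sampling method of the paper listed above).
import numpy as np

def zo_gradient(f, x, mu=1e-3, n_dirs=20):
    """Estimate grad f(x) from function values only, averaging random directions."""
    d = x.shape[0]
    g = np.zeros(d)
    for _ in range(n_dirs):
        u = np.random.randn(d)
        u /= np.linalg.norm(u)                    # random direction on the unit sphere
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return d * g / n_dirs                         # dimension factor for sphere sampling

# Toy usage on a quadratic objective (hypothetical, illustrative step size).
f = lambda v: float(np.sum(v ** 2))
x = np.random.randn(5)
for _ in range(200):
    x = x - 0.05 * zo_gradient(f, x)
print(f(x))                                       # should be close to 0
```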
This list is automatically generated from the titles and abstracts of the papers in this site.