NOVAS: Non-convex Optimization via Adaptive Stochastic Search for
End-to-End Learning and Control
- URL: http://arxiv.org/abs/2006.11992v3
- Date: Thu, 1 Apr 2021 19:43:46 GMT
- Title: NOVAS: Non-convex Optimization via Adaptive Stochastic Search for
End-to-End Learning and Control
- Authors: Ioannis Exarchos and Marcus A. Pereira and Ziyi Wang and Evangelos A.
Theodorou
- Abstract summary: We propose the use of adaptive search as a building block for general, non- neural optimization operations.
We benchmark it against two existing alternatives on a synthetic energy-based structured prediction task, and showcase its use in stochastic optimal control applications.
- Score: 22.120942106939122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work we propose the use of adaptive stochastic search as a building
block for general, non-convex optimization operations within deep neural
network architectures. Specifically, for an objective function located at some
layer in the network and parameterized by some network parameters, we employ
adaptive stochastic search to perform optimization over its output. This
operation is differentiable and does not obstruct the passing of gradients
during backpropagation, thus enabling us to incorporate it as a component in
end-to-end learning. We study the proposed optimization module's properties and
benchmark it against two existing alternatives on a synthetic energy-based
structured prediction task, and further showcase its use in stochastic optimal
control applications.
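The abstract's central mechanism, a differentiable inner optimizer based on adaptive stochastic search, can be illustrated with a minimal cross-entropy-style sketch in PyTorch. The shape function, the smoothed update, and all hyperparameters below are illustrative assumptions, not the paper's exact scheme:

```python
import torch

def adaptive_stochastic_search(objective, mu0, sigma=0.5, iters=10,
                               n_samples=64, step=1.0):
    # Cross-entropy-style inner loop: candidates are sampled around mu,
    # reweighted by a softmax shape function, and mu is smoothly updated.
    # Everything stays on the autodiff tape, so gradients pass through.
    mu = mu0
    for _ in range(iters):
        eps = torch.randn(n_samples, mu.shape[-1])
        x = mu + sigma * eps                     # candidate solutions
        f = objective(x)                         # per-candidate objective
        w = torch.softmax(-f, dim=0)             # lower cost -> higher weight
        mu = (1 - step) * mu + step * (w.unsqueeze(-1) * x).sum(dim=0)
    return mu

# Toy check: the inner minimizer depends on theta, and backprop reaches it.
theta = torch.tensor([2.0], requires_grad=True)
objective = lambda x: ((x - theta) ** 2).sum(dim=-1)
x_star = adaptive_stochastic_search(objective, torch.zeros(1))
x_star.sum().backward()          # theta.grad is populated
```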
Related papers
- Analyzing and Enhancing the Backward-Pass Convergence of Unrolled Optimization [50.38518771642365]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form.
This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is equivalent to the solution of a linear system by a particular iterative method.
A system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations.
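A minimal sketch of what backpropagation through unrolled optimization means in practice, assuming a toy quadratic inner problem (the names, constants, and inner loss are illustrative):

```python
import torch

# Differentiate through T steps of an inner gradient-descent solver.
# The folded-optimization analysis shows this backward pass is equivalent
# to solving a linear system with a particular iterative method.
def unrolled_inner_solver(theta, T=50, eta=0.1):
    x = torch.zeros_like(theta)
    for _ in range(T):
        grad = x - theta     # gradient of the inner loss 0.5*||x - theta||^2
        x = x - eta * grad   # one solver step, kept on the autodiff tape
    return x

theta = torch.randn(3, requires_grad=True)
outer_loss = unrolled_inner_solver(theta).pow(2).sum()
outer_loss.backward()        # gradient flows through all T iterations
```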
arXiv Detail & Related papers (2023-12-28T23:15:18Z)
- Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes conditional stochastic optimization algorithms for the federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z)
- Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers [109.52244418498974]
We propose a novel Admeta (A Double exponential Moving averagE To Adaptive and non-adaptive momentum) framework.
We provide two implementations, AdmetaR and AdmetaS, the former based on RAdam and the latter based on SGDM.
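As a rough illustration of the double exponential moving average underlying this framework, here is a hedged sketch; the coefficients and the final combination are assumptions, not Admeta's exact update:

```python
import torch

# Double exponential moving average (DEMA) of the gradient: an EMA of an
# EMA, combined so the result tracks recent gradients with less lag than
# a single EMA. beta and the 2*m1 - m2 combination are illustrative.
def dema_step(grad, m1, m2, beta=0.9):
    m1 = beta * m1 + (1 - beta) * grad   # first EMA of the gradient
    m2 = beta * m2 + (1 - beta) * m1     # EMA of that EMA
    return 2 * m1 - m2, m1, m2           # lag-corrected momentum estimate
```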
arXiv Detail & Related papers (2023-07-02T18:16:06Z)
- Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z)
- Transformer-Based Learned Optimization [37.84626515073609]
We propose a new approach to learned optimization where we represent the computation of an optimizer's update step using a neural network.
Our innovation is a new neural network architecture inspired by the classic BFGS algorithm.
We demonstrate the advantages of our approach on a benchmark composed of objective functions traditionally used for the evaluation of optimization algorithms.
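A hedged sketch of the learned-optimizer interface this describes, with a plain MLP standing in for the paper's BFGS-inspired transformer (the class name and input features are assumptions):

```python
import torch
import torch.nn as nn

# A network maps per-parameter gradient features to a predicted update.
# The paper's architecture is a transformer mirroring BFGS-style
# preconditioning; this MLP only illustrates the interface.
class LearnedStep(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, params, grads):
        feats = torch.stack([grads, grads.abs().log1p()], dim=-1)
        return params + self.net(feats).squeeze(-1)  # updated parameters
```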
arXiv Detail & Related papers (2022-12-02T09:47:08Z)
- Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient, scalable and general framework that can directly search on the tasks of interest.
Inspired by the innate tree structure of the underlying math expressions, we re-arrange the search spaces into a super-tree.
We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent-form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z)
- Bayesian Optimization for auto-tuning GPU kernels [0.0]
Finding optimal parameter configurations for GPU kernels is a non-trivial exercise for large search spaces, even when automated.
We introduce a novel contextual exploration factor as well as new acquisition functions with improved scalability, combined with an informed function selection mechanism.
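As a rough illustration, a confidence-bound acquisition with a static exploration factor is sketched below; the paper's contribution makes this factor contextual and adds new acquisition functions, and all names here are assumptions:

```python
import numpy as np

# Lower-confidence-bound acquisition for kernel auto-tuning (minimizing
# runtime). kappa trades off exploitation of the GP mean against
# exploration of uncertain configurations; here it is a fixed constant.
def lcb(mu, sigma, kappa=2.0):
    return mu - kappa * sigma            # lower is more promising

def next_kernel_config(candidates, gp_predict):
    mu, sigma = gp_predict(candidates)   # GP posterior mean/std per config
    return candidates[np.argmin(lcb(mu, sigma))]
```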
arXiv Detail & Related papers (2021-11-26T11:26:26Z)
- Additive Tree-Structured Conditional Parameter Spaces in Bayesian Optimization: A Novel Covariance Function and a Fast Implementation [34.89735938765757]
We generalize the additive assumption to tree-structured functions, showing improved sample-efficiency, wider applicability and greater flexibility.
By incorporating the structure information of parameter spaces and the additive assumption in the BO loop, we develop a parallel algorithm to optimize the acquisition function.
We demonstrate our method on an optimization benchmark function, on pruning pre-trained VGG16 and ResNet50 models, as well as on searching activation functions of ResNet20.
arXiv Detail & Related papers (2020-10-06T16:08:58Z)
- Adaptive pruning-based optimization of parameterized quantum circuits [62.997667081978825]
Variational hybrid quantum-classical algorithms are powerful tools to maximize the use of Noisy Intermediate-Scale Quantum devices.
We propose an adaptive pruning strategy for the ansatze used in variational quantum algorithms, which we call "Parameter-Efficient Circuit Training" (PECT).
Instead of optimizing all of the ansatz parameters at once, PECT launches a sequence of variational algorithms.
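A hedged sketch of the sequential, block-wise optimization idea, assuming a toy finite-difference gradient and an illustrative subset schedule (not the paper's exact PECT procedure):

```python
import numpy as np

# Central finite-difference gradient of a scalar cost function.
def finite_diff_grad(cost, theta, eps=1e-4):
    g = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta); d[i] = eps
        g[i] = (cost(theta + d) - cost(theta - d)) / (2 * eps)
    return g

# Rather than optimizing every ansatz parameter at once, run a sequence
# of smaller variational problems over growing active parameter subsets.
def sequential_vqa(cost, theta, stages, steps=100, lr=0.05):
    for active in stages:                  # e.g. [[0, 1], [0, 1, 2, 3], ...]
        for _ in range(steps):
            g = finite_diff_grad(cost, theta)
            theta[active] -= lr * g[active]   # update only the active block
    return theta
```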
arXiv Detail & Related papers (2020-10-01T18:14:11Z)
- Additive Tree-Structured Covariance Function for Conditional Parameter Spaces in Bayesian Optimization [34.89735938765757]
We generalize the additive assumption to tree-structured functions.
By incorporating the structure information of parameter spaces and the additive assumption in the BO loop, we develop a parallel algorithm to optimize the acquisition function.
arXiv Detail & Related papers (2020-06-21T11:21:55Z)
- Adaptive Stochastic Optimization [1.7945141391585486]
Adaptive optimization methods have the potential to offer significant computational savings when training large-scale systems.
Modern approaches based on the stochastic gradient method are non-adaptive in the sense that their implementation employs prescribed parameter values that need to be tuned for each application.
arXiv Detail & Related papers (2020-01-18T16:30:19Z)