Non-local Optimization: Imposing Structure on Optimization Problems by
Relaxation
- URL: http://arxiv.org/abs/2011.06064v3
- Date: Sat, 24 Jul 2021 13:20:12 GMT
- Title: Non-local Optimization: Imposing Structure on Optimization Problems by
Relaxation
- Authors: Nils Müller and Tobias Glasmachers
- Abstract summary: In evolutionary computation and reinforcement learning, the optimization of a function $f: \Omega \to \mathbb{R}$ is often addressed through optimizing a so-called relaxation $\theta \in \Theta \mapsto \mathbb{E}_\theta(f)$ of $f$.
We investigate the structure of such relaxations by means of measure theory and Fourier analysis, enabling us to shed light on the success of many associated optimization methods.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In stochastic optimization, particularly in evolutionary computation and
reinforcement learning, the optimization of a function $f: \Omega \to
\mathbb{R}$ is often addressed through optimizing a so-called relaxation
$\theta \in \Theta \mapsto \mathbb{E}_\theta(f)$ of $f$, where $\Theta$
resembles the parameters of a family of probability measures on $\Omega$. We
investigate the structure of such relaxations by means of measure theory and
Fourier analysis, enabling us to shed light on the success of many associated
stochastic optimization methods. The main structural traits we derive and that
allow fast and reliable optimization of relaxations are the consistency of
optimal values of $f$, Lipschitzness of gradients, and convexity. We emphasize
settings where $f$ itself is not differentiable or convex, e.g., in the
presence of (stochastic) disturbance.
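As a concrete illustration of the relaxation idea, the sketch below optimizes a Gaussian relaxation $\mathbb{E}_{\mu,\sigma}(f)$, with $x \sim \mathcal{N}(\mu, \sigma^2 I)$, of a non-differentiable, noisy objective via the standard score-function identity $\nabla_\mu \mathbb{E}_{\mu,\sigma}(f) = \mathbb{E}\left[f(x)\,(x-\mu)/\sigma^2\right]$: the relaxation is smooth in $\mu$ even though $f$ itself is not. This is a minimal sketch, not the paper's algorithm; all names and constants below are chosen for illustration.

```python
import numpy as np

# Minimal sketch (illustrative only, not the paper's algorithm): optimize the
# Gaussian relaxation E_{mu,sigma}(f) = E_{x ~ N(mu, sigma^2 I)}[f(x)] of a
# non-differentiable, noisy objective f via the score-function identity
#   grad_mu E_{mu,sigma}(f) = E[f(x) * (x - mu) / sigma^2].
# All names (f, mu, sigma, lr, ...) are chosen for illustration.

rng = np.random.default_rng(0)

def f(x):
    # Non-differentiable objective with stochastic disturbance.
    return np.abs(x).sum() + 0.1 * rng.standard_normal()

def relaxed_gradient(mu, sigma, n_samples=200):
    """Monte Carlo estimate of the gradient of E_{mu,sigma}(f) w.r.t. mu."""
    eps = rng.standard_normal((n_samples, mu.size))
    x = mu + sigma * eps                    # samples from N(mu, sigma^2 I)
    fx = np.array([f(xi) for xi in x])
    fx -= fx.mean()                         # baseline for variance reduction
    return (fx[:, None] * eps).mean(axis=0) / sigma

mu = np.full(5, 3.0)                        # mean of the search distribution
sigma, lr = 0.5, 0.2
for _ in range(300):
    mu -= lr * relaxed_gradient(mu, sigma)  # gradient descent on the relaxation

print("final mean:", mu)                    # drifts toward the minimizer 0 of |x|_1
```

Even though $f$ is neither differentiable nor noise-free, the relaxed objective $\mu \mapsto \mathbb{E}_{\mu,\sigma}(f)$ admits the well-behaved gradients that the abstract's structural results concern.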
Related papers
- Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity [50.25258834153574]
We focus on the class of (strongly) convex $(L_0,L_1)$-smooth functions and derive new convergence guarantees for several existing methods.
In particular, we derive improved convergence rates for Gradient Descent with Smoothed Gradient Clipping and for Gradient Descent with Polyak Stepsizes.
arXiv Detail & Related papers (2024-09-23T13:11:37Z) - Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach [57.92727189589498]
We propose an online convex optimization approach with two different levels of adaptivity.
We obtain $\mathcal{O}(\log V_T)$, $\mathcal{O}(d \log V_T)$, and $\hat{\mathcal{O}}(\sqrt{V_T})$ regret bounds for strongly convex, exp-concave, and convex loss functions, respectively.
arXiv Detail & Related papers (2023-07-17T09:55:35Z) - Random Function Descent [0.0]
We show that a 'stochastic Taylor' approximation to gradient descent is scalable in high dimensions.
arXiv Detail & Related papers (2023-05-02T12:53:50Z) - Manifold Free Riemannian Optimization [4.484251538832438]
A principled framework for solving optimization problems over a smooth manifold $\mathcal{M}$ is proposed.
We use a noiseless sample set $(x_i, y_i) \in \mathcal{M} \times \mathbb{R}$ of the cost function and the intrinsic dimension of the manifold $\mathcal{M}$.
arXiv Detail & Related papers (2022-09-07T16:19:06Z) - Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem $\min_{\mathbf{x}}\max_{\mathbf{y}} F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present an accelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragradient steps with acceleration.
arXiv Detail & Related papers (2022-06-17T06:10:20Z) - Non-Convex Optimization with Certificates and Fast Rates Through Kernel
Sums of Squares [68.8204255655161]
We consider potentially non-convex optimization problems.
In this paper, we propose an algorithm that achieves close to optimal a priori computational guarantees.
arXiv Detail & Related papers (2022-04-11T09:37:04Z) - Generalization in Supervised Learning Through Riemannian Contraction [4.3604518673788135]
In a supervised learning setting, we show that if an optimizer is contracting in some Riemannian metric with rate $\lambda > 0$, then it is uniformly algorithmically stable with rate $\mathcal{O}(1/(\lambda n))$, where $n$ is the number of training examples.
The results hold for stochastic and deterministic gradient descent, in both continuous and discrete time.
They can be shown to be optimal in certain linear settings, such as gradient descent over convex or strongly convex loss surfaces.
arXiv Detail & Related papers (2022-01-17T23:08:47Z) - On Stochastic Moving-Average Estimators for Non-Convex Optimization [105.22760323075008]
In this paper, we demonstrate the power of a widely used stochastic estimator based on moving averages (SEMA) for a range of stochastic non-convex optimization problems; a generic moving-average gradient estimator of this kind is sketched after this list.
We also present state-of-the-art results for all of these problems.
arXiv Detail & Related papers (2021-04-30T08:50:24Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove an $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with a mini-batch size of $1$, constant first- and second-order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Stochastic Recursive Variance Reduction for Efficient Smooth Non-Convex
Compositional Optimization [20.410012564838933]
Compositional optimization arises in many important machine learning tasks such as value function evaluation in reinforcement learning and portfolio management.
In this paper, we investigate general compositional optimization in the smooth non-convex setting.
arXiv Detail & Related papers (2019-12-31T18:59:13Z)
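The moving-average (SEMA-style) estimator referenced in the entry above can be sketched generically as follows; whether this matches that paper's exact update is an assumption, and all names here are hypothetical.

```python
import numpy as np

# Generic moving-average stochastic gradient estimator: the running estimate
# v is an exponential average of noisy gradients, and the iterate follows v.
# Illustrative only; the exact SEMA update in the cited paper may differ.

rng = np.random.default_rng(1)

def stochastic_grad(x):
    # Noisy gradient of a smooth toy objective f(x) = sum(x_i^2 + cos(x_i)).
    return 2 * x - np.sin(x) + 0.1 * rng.standard_normal(x.size)

x = np.full(4, 2.0)
v = np.zeros_like(x)            # moving-average gradient estimate
beta, lr = 0.1, 0.05
for _ in range(500):
    v = (1 - beta) * v + beta * stochastic_grad(x)  # exponential moving average
    x -= lr * v                                     # step along the averaged estimate

print("final iterate:", x)
```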