Non-local Optimization: Imposing Structure on Optimization Problems by
Relaxation
- URL: http://arxiv.org/abs/2011.06064v3
- Date: Sat, 24 Jul 2021 13:20:12 GMT
- Title: Non-local Optimization: Imposing Structure on Optimization Problems by
Relaxation
- Authors: Nils M\"uller and Tobias Glasmachers
- Abstract summary: In evolutionary computation and reinforcement learning, the optimization of a function $f: \Omega \to \mathbb{R}$ is often addressed through optimizing a so-called relaxation $\theta \in \Theta \mapsto \mathbb{E}_\theta(f)$ of $f$.
We investigate the structure of such relaxations by means of measure theory and Fourier analysis, enabling us to shed light on the success of many associated optimization methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In stochastic optimization, particularly in evolutionary computation and
reinforcement learning, the optimization of a function $f: \Omega \to
\mathbb{R}$ is often addressed through optimizing a so-called relaxation
$\theta \in \Theta \mapsto \mathbb{E}_\theta(f)$ of $f$, where $\Theta$
represents the parameters of a family of probability measures on $\Omega$. We
investigate the structure of such relaxations by means of measure theory and
Fourier analysis, enabling us to shed light on the success of many associated
stochastic optimization methods. The main structural traits we derive and that
allow fast and reliable optimization of relaxations are the consistency of
optimal values of $f$, Lipschitzness of gradients, and convexity. We emphasize
settings where $f$ itself is not differentiable or convex, e.g., in the
presence of (stochastic) disturbance.
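To make the relaxation idea concrete, the following is a minimal Python sketch (illustrative, not the authors' method): it assumes a diagonal Gaussian family with parameters $\theta = (\mu, \log\sigma)$ on $\Omega = \mathbb{R}^d$ and runs gradient descent on the relaxation $\theta \mapsto \mathbb{E}_\theta(f)$, estimating its gradient with the score-function identity $\nabla_\theta \mathbb{E}_\theta(f) = \mathbb{E}_\theta[f(x)\,\nabla_\theta \log p_\theta(x)]$, so $f$ itself never needs to be differentiable. The test objective and all hyperparameters are illustrative assumptions.

import numpy as np

# Minimal sketch (not the paper's algorithm): minimize the Gaussian relaxation
# E_theta(f) with theta = (mu, log_sigma), using the score-function estimator
#   grad_theta E_theta(f) = E_theta[ f(x) * grad_theta log p_theta(x) ].

def f(x):
    # Hypothetical non-differentiable, noisy objective, used only for illustration.
    return np.sum(np.abs(x)) + 0.1 * np.random.randn()

def relaxed_minimize(f, dim=5, steps=300, samples=64, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    mu = rng.normal(size=dim)        # mean of the search distribution
    log_sigma = np.zeros(dim)        # log standard deviations (diagonal covariance)

    for _ in range(steps):
        sigma = np.exp(log_sigma)
        eps = rng.standard_normal((samples, dim))
        xs = mu + sigma * eps                      # samples from N(mu, diag(sigma^2))
        fs = np.array([f(x) for x in xs])
        fs = (fs - fs.mean()) / (fs.std() + 1e-8)  # baseline/normalization for variance reduction

        # Monte Carlo score-function gradients of E_theta(f), using
        #   grad_mu        log p = (x - mu) / sigma^2 = eps / sigma
        #   grad_log_sigma log p = eps^2 - 1
        grad_mu = (fs[:, None] * eps / sigma).mean(axis=0)
        grad_log_sigma = (fs[:, None] * (eps ** 2 - 1.0)).mean(axis=0)

        mu -= lr * grad_mu             # descend on the relaxation, not on f directly
        log_sigma -= lr * grad_log_sigma

    return mu

if __name__ == "__main__":
    print("approximate minimizer:", relaxed_minimize(f))

The update coincides with a basic evolution-strategies step; the relaxation is differentiable in $\theta$ even though $f$ is neither differentiable nor convex, which is the structural point emphasized in the abstract.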
Related papers
- A Novel Unified Parametric Assumption for Nonconvex Optimization [53.943470475510196]
Non-convex optimization is central to machine learning, but the general framework of non-convexity yields convergence guarantees that are too pessimistic compared to practice.
We introduce a novel unified parametric assumption for non-convex optimization algorithms.
arXiv Detail & Related papers (2025-02-17T21:25:31Z) - Optimal Rates for Robust Stochastic Convex Optimization [12.620782629498812]
We develop novel algorithms that achieve minimax-optimal excess risk under the $\epsilon$-contamination model.
Our algorithms do not require restrictive assumptions, and can handle nonsmooth but Lipschitz population loss functions.
arXiv Detail & Related papers (2024-12-15T00:52:08Z) - Extended convexity and smoothness and their applications in deep learning [5.281849820329249]
This paper introduces an optimization framework aimed at providing a theoretical foundation for a class of composite optimization problems, particularly those in deep learning.
We analyze descent methods for objective functions satisfying $\mathcal{H}(\Phi)$-smoothness, a generalization of Lipschitz smoothness.
arXiv Detail & Related papers (2024-10-08T08:40:07Z) - Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity [50.25258834153574]
We focus on the class of (strongly) convex $(L_0,L_1)$-smooth functions and derive new convergence guarantees for several existing methods.
In particular, we derive improved convergence rates for Gradient Descent with Smoothed Gradient Clipping and for Gradient Descent with Polyak Stepsizes.
arXiv Detail & Related papers (2024-09-23T13:11:37Z) - Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach [57.92727189589498]
We propose an online convex optimization approach with two different levels of adaptivity.
We obtain $\mathcal{O}(\log V_T)$, $\mathcal{O}(d \log V_T)$ and $\hat{\mathcal{O}}(\sqrt{V_T})$ regret bounds for strongly convex, exp-concave and convex loss functions.
arXiv Detail & Related papers (2023-07-17T09:55:35Z) - Random Function Descent [0.0]
We show that a 'stochastic Taylor approximation' recovers gradient descent and is scalable in high dimensions.
arXiv Detail & Related papers (2023-05-02T12:53:50Z) - Manifold Free Riemannian Optimization [4.484251538832438]
A principled framework for solving optimization problems with a smooth manifold $\mathcal{M}$ is proposed.
We use a noiseless sample set of the cost function $(x_i, y_i) \in \mathcal{M} \times \mathbb{R}$ and the intrinsic dimension of the manifold $\mathcal{M}$.
arXiv Detail & Related papers (2022-09-07T16:19:06Z) - Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $\min_{\mathbf{x}}\max_{\mathbf{y}} F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present an accelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragradient steps with Nesterov's acceleration.
arXiv Detail & Related papers (2022-06-17T06:10:20Z) - Non-Convex Optimization with Certificates and Fast Rates Through Kernel
Sums of Squares [68.8204255655161]
We consider potentially non-convex optimization problems.
In this paper, we propose an algorithm that achieves close to optimal a priori computational guarantees.
arXiv Detail & Related papers (2022-04-11T09:37:04Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)