Why Do Local Methods Solve Nonconvex Problems?
- URL: http://arxiv.org/abs/2103.13462v1
- Date: Wed, 24 Mar 2021 19:34:11 GMT
- Title: Why Do Local Methods Solve Nonconvex Problems?
- Authors: Tengyu Ma
- Abstract summary: Non-convex optimization is ubiquitous in modern machine learning.
We hypothesize a unified explanation for this phenomenon.
We rigorously formalize it for concrete instances of machine learning problems.
- Score: 54.284687261929115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Non-convex optimization is ubiquitous in modern machine learning. Researchers
devise non-convex objective functions and optimize them using off-the-shelf
optimizers such as stochastic gradient descent and its variants, which leverage
the local geometry and update iteratively. Even though solving non-convex
functions is NP-hard in the worst case, the optimization quality in practice is
often not an issue -- optimizers are largely believed to find approximate
global minima. Researchers hypothesize a unified explanation for this
intriguing phenomenon: most of the local minima of the practically-used
objectives are approximately global minima. We rigorously formalize it for
concrete instances of machine learning problems.
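To make the hypothesized phenomenon concrete, the following is a minimal sketch (not from the paper) of plain gradient descent on the non-convex low-rank matrix factorization objective f(U) = ||U U^T - M||_F^2, a standard instance for which all local minima are known to be global; the dimensions, step size, and iteration count are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
d, r = 20, 3

# Ground-truth rank-r PSD matrix M = U_star @ U_star.T, scaled so its spectrum is O(1).
U_star = rng.standard_normal((d, r)) / np.sqrt(d)
M = U_star @ U_star.T

def loss(U):
    return np.linalg.norm(U @ U.T - M) ** 2        # squared Frobenius norm

def grad(U):
    return 4.0 * (U @ U.T - M) @ U                 # gradient of the objective in U

U = rng.standard_normal((d, r)) / np.sqrt(d)       # random initialization
step = 0.02
for _ in range(20000):
    U = U - step * grad(U)

# Despite non-convexity, plain gradient descent typically ends close to a global minimum here.
print(f"final objective: {loss(U):.2e}")
print(f"relative error : {np.linalg.norm(U @ U.T - M) / np.linalg.norm(M):.2e}")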
Related papers
- Super Gradient Descent: Global Optimization requires Global Gradient [0.0]
This article introduces a novel optimization method that guarantees convergence to the global minimum for any k-Lipschitz function defined on a closed interval.
Our approach addresses the limitations of traditional optimization algorithms, which often get trapped in local minima.
arXiv Detail & Related papers (2024-10-25T17:28:39Z)
- Review Non-convex Optimization Method for Machine Learning [0.0]
Non-convex optimization is a critical tool in advancing machine learning, especially for complex models like deep neural networks and support vector machines.
This paper examines the methods and applications of non-convex optimization in machine learning.
arXiv Detail & Related papers (2024-10-02T20:34:33Z)
- A Particle-based Sparse Gaussian Process Optimizer [5.672919245950197]
We present a new swarm-based framework utilizing the underlying dynamical process of gradient descent.
The biggest advantage of this approach is greater exploration around the current state before deciding the descent direction.
arXiv Detail & Related papers (2022-11-26T09:06:15Z)
- Learning Proximal Operators to Discover Multiple Optima [66.98045013486794]
We present an end-to-end method to learn the proximal operator across a family of non-convex problems.
We show that for weakly-convex objectives and under mild conditions, the method converges globally.
arXiv Detail & Related papers (2022-01-28T05:53:28Z)
- Fighting the curse of dimensionality: A machine learning approach to finding global optima [77.34726150561087]
This paper shows how to find global optima in structural optimization problems.
By exploiting certain cost functions, we either obtain the global optimum at best or obtain results superior to established optimization procedures at worst.
arXiv Detail & Related papers (2021-10-28T09:50:29Z)
- Combining resampling and reweighting for faithful stochastic optimization [1.52292571922932]
When the loss function is a sum of multiple terms, a popular method is stochastic gradient descent.
We show that the difference in the Lipschitz constants of the terms in the loss function causes stochastic gradient descent to exhibit different variances at different minima (a small numerical sketch of this effect follows the list below).
arXiv Detail & Related papers (2021-05-31T04:21:25Z)
- Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex optimization in deep learning and other problems in data science, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z)
- Community detection using fast low-cardinality semidefinite programming [94.4878715085334]
We propose a new low-cardinality algorithm that generalizes the local update to maximize a semidefinite relaxation derived from max-k-cut.
The proposed algorithm is scalable, outperforms state-of-the-art algorithms, and does so with little additional cost in real-world running time.
arXiv Detail & Related papers (2020-12-04T15:46:30Z)
- Newton-type Methods for Minimax Optimization [37.58722381375258]
We propose two novel Newton-type algorithms for nonconvex-nonconcave minimax optimization.
We prove their local convergence at strict local minimax points, which are surrogates of global solutions.
arXiv Detail & Related papers (2020-06-25T17:38:00Z)
- Gradient Free Minimax Optimization: Variance Reduction and Faster Convergence [120.9336529957224]
In this paper, we study gradient-free minimax optimization in the nonconvex strongly-concave setting.
We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z)
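As a concrete illustration of the variance effect described in the resampling-and-reweighting entry above, here is a minimal numerical sketch (not taken from that paper): the averaged loss F(x) = (x^2 - 1)^2 is written as the mean of two hand-picked terms with very different gradients, and the spread of the per-term gradients, which sets the noise level of stochastic gradient descent, vanishes at one minimum but not at the other. The specific functions and constants are illustrative assumptions.

import numpy as np

# Average loss F(x) = 0.5*(f1(x) + f2(x)) = (x**2 - 1)**2, with minima at x = +1 and x = -1.
# The two terms are chosen so that their gradients agree at x = +1 but disagree strongly at x = -1.
def grad_f1(x):          # f1(x) = 2*(x**2 - 1)**2 + 5*(x - 1)**2
    return 8 * x * (x**2 - 1) + 10 * (x - 1)

def grad_f2(x):          # f2(x) = -5*(x - 1)**2
    return -10 * (x - 1)

def sgd_noise_std(x):
    # SGD samples one term uniformly at random; at a minimum of F the full gradient
    # is zero, so the spread of the per-term gradients is the SGD noise level there.
    return np.std([grad_f1(x), grad_f2(x)])

for x_star in (+1.0, -1.0):
    print(f"minimum at x = {x_star:+.0f}: SGD noise std ~ {sgd_noise_std(x_star):.1f}")

# Prints roughly 0.0 at x = +1 and 20.0 at x = -1: with the same step size, plain SGD
# fluctuates far more around one minimum than the other, which is the kind of imbalance
# that resampling and reweighting schemes aim to correct.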
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.