AGGLIO: Global Optimization for Locally Convex Functions
- URL: http://arxiv.org/abs/2111.03932v1
- Date: Sat, 6 Nov 2021 18:15:56 GMT
- Title: AGGLIO: Global Optimization for Locally Convex Functions
- Authors: Debojyoti Dey and Bhaskar Mukhoty and Purushottam Kar
- Abstract summary: This paper presents AGGLIO (Accelerated Graduated Generalized LInear-model Optimization), a stage-wise, graduated optimization technique that offers provable global convergence for non-convex problems that are only locally convex.
AGGLIO can be readily implemented using point as well as mini-batch SGD updates and offers provable convergence together with strong empirical performance.
- Score: 5.221860952360943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents AGGLIO (Accelerated Graduated Generalized LInear-model
Optimization), a stage-wise, graduated optimization technique that offers
global convergence guarantees for non-convex optimization problems whose
objectives offer only local convexity and may fail to be even quasi-convex at a
global scale. In particular, this includes learning problems that utilize
popular activation functions such as sigmoid, softplus and SiLU that yield
non-convex training objectives. AGGLIO can be readily implemented using point
as well as mini-batch SGD updates and offers provable convergence to the global
optimum in general conditions. In experiments, AGGLIO outperformed several
recently proposed optimization techniques for non-convex and locally convex
objectives in terms of convergence rate as well as convergent accuracy. AGGLIO
relies on a graduation technique for generalized linear models, as well as a
novel proof strategy, both of which may be of independent interest.
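As a rough illustration of the stage-wise graduated scheme described above, the sketch below minimizes a least-squares sigmoid-GLM objective while graduating a scale parameter from a smooth surrogate toward the true activation. The geometric schedule, squared loss, and step size are assumptions for illustration, not AGGLIO's exact choices.

```python
import numpy as np

def agglio_style_sgd(X, y, n_stages=5, steps_per_stage=2000, lr=0.1, seed=0):
    """Stage-wise graduated SGD for a sigmoid GLM (illustrative sketch).

    Minimizes sum_i (sigmoid(c * <w, x_i>) - y_i)^2 while graduating the
    scale c from a small value (a smoother, better-behaved surrogate) up
    to 1 (the true objective). Schedule and loss are assumptions, not
    AGGLIO's exact choices.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for c in np.geomspace(0.1, 1.0, n_stages):  # graduation schedule (assumed)
        for _ in range(steps_per_stage):
            i = rng.integers(n)                 # point (single-sample) SGD update
            z = c * (X[i] @ w)
            p = 1.0 / (1.0 + np.exp(-z))
            grad = 2.0 * (p - y[i]) * p * (1.0 - p) * c * X[i]
            w -= lr * grad
    return w
```

Early stages with small c flatten the activation, so SGD is not trapped by the spurious stationary points of the final non-convex objective; each stage warm-starts the next.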
Related papers
- Super Gradient Descent: Global Optimization requires Global Gradient [0.0]
This article introduces a novel optimization method that guarantees convergence to the global minimum for any k-Lipschitz function defined on a closed interval.
Our approach addresses the limitations of traditional optimization algorithms, which often get trapped in local minima.
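For intuition, a k-Lipschitz function on a closed interval can be globally minimized to any accuracy by a certified grid search. The sketch below is that brute-force baseline under the paper's Lipschitz assumption, not the article's proposed method.

```python
def lipschitz_global_min(f, a, b, k, eps):
    """Certified global minimization of a k-Lipschitz f on [a, b].

    With grid spacing h <= 2*eps/k, the best grid value is within eps of
    the true global minimum, since |f(x) - f(x_i)| <= k*h/2 for the
    nearest grid point x_i. A brute-force baseline, not the paper's method.
    """
    n = max(2, int(k * (b - a) / (2 * eps)) + 2)  # ensures spacing <= 2*eps/k
    xs = [a + i * (b - a) / (n - 1) for i in range(n)]
    return min(xs, key=f)
```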
arXiv Detail & Related papers (2024-10-25T17:28:39Z)
- Localized Zeroth-Order Prompt Optimization [54.964765668688806]
We propose a novel algorithm, namely localized zeroth-order prompt optimization (ZOPO).
ZOPO incorporates a Gaussian process derived from the Neural Tangent Kernel into standard zeroth-order optimization for an efficient search of well-performing local optima in prompt optimization.
Remarkably, ZOPO outperforms existing baselines in terms of both the optimization performance and the query efficiency.
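For readers unfamiliar with zeroth-order optimization, the standard two-point estimator below recovers a descent direction from function values alone; ZOPO's NTK-derived Gaussian process model is beyond this sketch, and the step size and smoothing radius are illustrative assumptions.

```python
import numpy as np

def zeroth_order_step(f, x, lr=0.01, mu=1e-2, rng=None):
    """One step of generic two-point zeroth-order gradient descent.

    Estimates the gradient of f from function values only:
    g ~ (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u,  u ~ N(0, I).
    A textbook estimator, not ZOPO's NTK-based machinery.
    """
    rng = rng or np.random.default_rng(0)
    u = rng.standard_normal(x.shape)
    g = (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return x - lr * g
```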
arXiv Detail & Related papers (2024-03-05T14:18:15Z)
- ProGO: Probabilistic Global Optimizer [9.772380490791635]
In this paper we develop an algorithm that converges to the global optimum under some mild conditions.
We show that the proposed algorithm outperforms, by orders of magnitude, many existing state-of-the-art methods.
arXiv Detail & Related papers (2023-10-04T22:23:40Z)
- Deterministic Langevin Unconstrained Optimization with Normalizing Flows [3.988614978933934]
We introduce a global, gradient-free surrogate optimization strategy for black-box functions inspired by the Fokker-Planck and Langevin equations.
We demonstrate superior or competitive progress toward objective optima on standard synthetic test functions.
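The Langevin dynamics that inspire this strategy fit in a few lines; the sketch below shows the plain gradient-based update, whereas the paper itself is gradient-free and works with a normalizing-flow surrogate.

```python
import numpy as np

def langevin_step(grad_f, x, eta=1e-3, rng=None):
    """One unadjusted Langevin update: x <- x - eta*grad_f(x) + sqrt(2*eta)*xi.

    The stationary distribution is proportional to exp(-f), so iterates
    concentrate near low values of f. Shown as the inspiring dynamics
    only; the paper's method does not require gradients.
    """
    rng = rng or np.random.default_rng(0)
    xi = rng.standard_normal(x.shape)
    return x - eta * grad_f(x) + np.sqrt(2.0 * eta) * xi
```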
arXiv Detail & Related papers (2023-10-01T17:46:20Z)
- Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation [84.0621253654014]
We propose a framework, called BALLET, which adaptively filters for a high-confidence region of interest.
We show theoretically that BALLET can efficiently shrink the search space, and can exhibit a tighter regret bound than standard BO.
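A minimal reading of the high-confidence filtering idea is sketched below, assuming a maximization problem and Gaussian-process confidence bounds with parameter beta; both are illustrative assumptions, not BALLET's exact rule.

```python
import numpy as np

def high_confidence_roi(mu, sigma, beta=2.0):
    """Filter candidates to a high-confidence region of interest.

    Given GP posterior means `mu` and standard deviations `sigma` over
    candidate points (maximization), keep only points whose upper
    confidence bound reaches the best lower confidence bound: if the
    confidence bounds hold, the maximizer must lie among them.
    """
    lcb = mu - beta * sigma
    ucb = mu + beta * sigma
    return np.flatnonzero(ucb >= lcb.max())
```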
arXiv Detail & Related papers (2023-07-25T09:45:47Z)
- Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape [59.841889495864386]
In federated learning (FL), a cluster of local clients is coordinated by a global server.
Clients are prone to overfitting their own local optima, which can deviate sharply from the global objective.
FedSMOO adopts a dynamic regularizer to guide the local optima toward the global objective.
Our theoretical analysis indicates that FedSMOO achieves a fast $\mathcal{O}(1/T)$ convergence rate with a low generalization bound.
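To show the regularize-then-aggregate pattern, here is a minimal FedProx-style round in which each client's local objective is pulled toward the global model by a proximal term. FedSMOO's actual dynamic, sharpness-aware regularizer is more involved; `lam`, `lr`, and `local_steps` are illustrative assumptions.

```python
import numpy as np

def fedprox_round(w_global, client_grads, lam=0.1, lr=0.05, local_steps=10):
    """One communication round with a proximal regularizer (FedProx-style).

    Each client descends f_i(w) + (lam/2)*||w - w_global||^2, pulling its
    local optimum toward the global model; the server then averages.
    This is the generic pattern only, not FedSMOO's exact regularizer.
    """
    local_models = []
    for grad_fi in client_grads:          # one gradient oracle per client
        w = w_global.copy()
        for _ in range(local_steps):
            w -= lr * (grad_fi(w) + lam * (w - w_global))
        local_models.append(w)
    return np.mean(local_models, axis=0)  # server-side aggregation
```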
arXiv Detail & Related papers (2023-05-19T10:47:44Z)
- Fighting the curse of dimensionality: A machine learning approach to finding global optima [77.34726150561087]
This paper shows how to find global optima in structural optimization problems.
By exploiting certain cost functions we either obtain the global optimum at best or obtain superior results at worst when compared to established optimization procedures.
arXiv Detail & Related papers (2021-10-28T09:50:29Z)
- Why Do Local Methods Solve Nonconvex Problems? [54.284687261929115]
Non-convex optimization is ubiquitous in modern machine learning.
We rigorously formalize it for instances of machine learning problems.
We hypothesize a unified explanation for this phenomenon.
arXiv Detail & Related papers (2021-03-24T19:34:11Z)
- An adaptive stochastic gradient-free approach for high-dimensional blackbox optimization [0.0]
We propose an adaptive stochastic gradient-free (ASGF) approach for high-dimensional non-smooth problems.
We illustrate the performance of this method on benchmark global optimization problems and learning tasks.
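A generic Gaussian-smoothing gradient estimator of the kind such gradient-free methods build on is sketched below; ASGF's adaptive choice of smoothing radius and sampling directions is omitted, and `mu` and `n_samples` are assumptions.

```python
import numpy as np

def gaussian_smoothing_grad(f, x, mu=0.1, n_samples=20, rng=None):
    """Gradient-free estimate of grad f via Gaussian smoothing.

    Approximates the gradient of the smoothed objective
    f_mu(x) = E[f(x + mu*u)], u ~ N(0, I), with forward differences.
    A generic estimator, not ASGF's adaptive scheme.
    """
    rng = rng or np.random.default_rng(0)
    fx = f(x)
    g = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - fx) / mu * u
    return g / n_samples
```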
arXiv Detail & Related papers (2020-06-18T22:47:58Z)
- Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation with Gaussian processes trained on few data points.
The approach also leads to significantly smaller and computationally cheaper subproblems for lower bounding.
In total, the proposed method reduces the time to convergence by orders of magnitude.
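For context, the few-point Gaussian process surrogate at the heart of such approaches can be computed exactly in a few lines; the reduced-space formulation and the bounding machinery of the paper are not reproduced in this sketch.

```python
import numpy as np

def gp_posterior(X, y, X_test, lengthscale=1.0, noise=1e-6):
    """Exact GP posterior mean/variance with a unit-variance RBF kernel.

    The kind of few-point Gaussian process surrogate that global
    optimization schemes bound and branch over; a generic sketch only.
    """
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / lengthscale**2)

    K = rbf(X, X) + noise * np.eye(len(X))   # training covariance
    Ks = rbf(X_test, X)                      # cross-covariance
    mean = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, var
```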
arXiv Detail & Related papers (2020-05-21T20:59:11Z)
- Learning to be Global Optimizer [28.88646928299302]
We learn an optimization algorithm, combining a learned network with an escaping capability, for some benchmark functions.
We show that the learned algorithm significantly outperforms some well-known classical optimization algorithms.
arXiv Detail & Related papers (2020-03-10T03:46:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.