Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings
- URL: http://arxiv.org/abs/2506.08428v1
- Date: Tue, 10 Jun 2025 04:03:59 GMT
- Title: Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings
- Authors: Evan Markou, Thalaiyasingam Ajanthan, Stephen Gould
- Abstract summary: We show that well-designed reduction mappings improve curvature properties of the objective, leading to better-conditioned problems and theoretically faster convergence for gradient-based methods. Our analysis unifies a range of scenarios where structural information at optimality is leveraged to accelerate convergence, offering a principled explanation for the empirical gains observed in such optimisation algorithms.
- Score: 34.309687104447114
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many high-dimensional optimisation problems exhibit rich geometric structures in their set of minimisers, often forming smooth manifolds due to over-parametrisation or symmetries. When this structure is known, at least locally, it can be exploited through reduction mappings that reparametrise part of the parameter space to lie on the solution manifold. These reductions naturally arise from inner optimisation problems and effectively remove redundant directions, yielding a lower-dimensional objective. In this work, we introduce a general framework to understand how such reductions influence the optimisation landscape. We show that well-designed reduction mappings improve curvature properties of the objective, leading to better-conditioned problems and theoretically faster convergence for gradient-based methods. Our analysis unifies a range of scenarios where structural information at optimality is leveraged to accelerate convergence, offering a principled explanation for the empirical gains observed in such optimisation algorithms.
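To make the reduction-mapping idea concrete, here is a minimal sketch (a toy coupled quadratic, not the paper's setting; the coefficients and iteration counts are illustrative assumptions). The inner problem is solved in closed form to build the reduced objective F(x) = min_y f(x, y), whose curvature (the Schur complement) is far better conditioned than the joint Hessian, so gradient descent tolerates much larger steps:
```python
import jax
import jax.numpy as jnp

# Coupled quadratic f(x, y) = 0.5*a*x^2 + b*x*y + 0.5*c*y^2 with a*c > b^2.
a, b, c = 1.0, 3.0, 10.0

def f(x, y):
    return 0.5 * a * x**2 + b * x * y + 0.5 * c * y**2

# Reduction mapping from the inner problem: y*(x) = argmin_y f(x, y) = -(b/c)*x,
# so the reduced objective is F(x) = f(x, y*(x)) = 0.5*(a - b^2/c)*x^2.
def F(x):
    return f(x, -(b / c) * x)

H = jnp.array([[a, b], [b, c]])
L_joint = jnp.max(jnp.linalg.eigvalsh(H))  # largest curvature of f: ~10.9
L_reduced = a - b**2 / c                   # curvature of F (Schur complement): 0.1

grad_f = jax.grad(f, argnums=(0, 1))
grad_F = jax.grad(F)

# Gradient descent on each problem with step size 1/L, the largest safe scale.
x, y, xr = 2.0, 2.0, 2.0
for _ in range(100):
    gx, gy = grad_f(x, y)
    x, y = x - gx / L_joint, y - gy / L_joint
    xr = xr - grad_F(xr) / L_reduced
print(f(x, y), F(xr))  # the reduced iterate reaches the minimum (0) far sooner
```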
Related papers
- Designing Algorithms for Entropic Optimal Transport from an Optimisation Perspective [12.343553053539976]
We develop a collection of novel methods for the entropic-regularised optimal transport problem. We are inspired by existing mirror-descent interpretations of the Sinkhorn algorithm. The broader framework we develop, based on optimisation over the joint distributions, also finds an analogue in the Schrödinger bridge problem.
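For context, a minimal sketch of the classical Sinkhorn iteration that this line of work reinterprets (the cost matrix, marginals, and regularisation strength below are illustrative assumptions):
```python
import jax.numpy as jnp

def sinkhorn(C, mu, nu, eps=0.1, iters=500):
    """Classical Sinkhorn iteration for entropic-regularised optimal transport."""
    K = jnp.exp(-C / eps)                 # Gibbs kernel
    u, v = jnp.ones_like(mu), jnp.ones_like(nu)
    for _ in range(iters):                # alternating marginal projections
        u = mu / (K @ v)
        v = nu / (K.T @ u)
    return u[:, None] * K * v[None, :]    # transport plan

# Toy example: two distributions on three points with a squared-distance cost.
C = jnp.array([[0.0, 1.0, 4.0], [1.0, 0.0, 1.0], [4.0, 1.0, 0.0]])
mu = jnp.array([0.5, 0.3, 0.2])
nu = jnp.array([0.2, 0.3, 0.5])
P = sinkhorn(C, mu, nu)
print(P.sum(axis=1), P.sum(axis=0))       # approximately mu and nu
```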
arXiv Detail & Related papers (2025-07-16T13:56:11Z)
- Riemannian Optimization on the Oblique Manifold for Sparse Simplex Constraints via Multiplicative Updates [0.0]
Low-rank optimization problems with sparse simplex constraints involve variables that must satisfy nonnegativity, sparsity, and sum-to-one conditions. We propose a novel manifold optimization approach to tackle these problems efficiently.
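As a generic illustration of the multiplicative-update idea referenced in the title (not the paper's oblique-manifold algorithm; the least-squares objective below is an assumed toy problem), multiplicative updates preserve nonnegativity by construction, and a renormalisation step restores the sum-to-one condition:
```python
import jax.numpy as jnp

# Assumed toy problem: min_x 0.5*||A x - b||^2 with x >= 0 and sum(x) = 1,
# where A and b are nonnegative so the gradient splits into +/- parts.
A = jnp.array([[1.0, 2.0, 0.5], [0.0, 1.0, 1.5], [2.0, 0.5, 1.0]])
b = jnp.array([1.0, 0.5, 0.8])

x = jnp.ones(3) / 3.0                  # start at the simplex barycentre
for _ in range(300):
    num = A.T @ b                      # positive part of the gradient split
    den = A.T @ (A @ x) + 1e-12        # negative part (guarded against /0)
    x = x * (num / den)                # multiplicative update keeps x >= 0
    x = x / x.sum()                    # renormalisation heuristic: sum(x) = 1
print(x, x.sum())
```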
arXiv Detail & Related papers (2025-03-31T13:31:05Z)
- Elucidating Subspace Perturbation in Zeroth-Order Optimization: Theory and Practice at Scale [33.38543010618118]
Zeroth-order (ZO) optimization has emerged as a promising alternative to gradient-based backpropagation methods. We show that high dimensionality is the primary bottleneck and introduce the notion of subspace alignment to explain how the subspace perturbations reduce gradient noise and accelerate convergence. We propose an efficient ZO method using block coordinate descent (MeZO-BCD), which perturbs and updates only a subset of parameters at each step.
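A minimal sketch of block-coordinate zeroth-order estimation in the spirit of MeZO-BCD (a generic two-point estimator restricted to a random block; the quadratic loss and hyperparameters are assumptions, not the paper's exact recipe):
```python
import jax
import jax.numpy as jnp

def loss(theta):
    return jnp.sum((theta - 1.0) ** 2)   # assumed stand-in for a training loss

d, block, mu, lr = 64, 8, 1e-3, 0.02
key = jax.random.PRNGKey(0)
theta = jnp.zeros(d)

for _ in range(500):
    key, k1, k2 = jax.random.split(key, 3)
    idx = jax.random.choice(k1, d, shape=(block,), replace=False)
    z_block = jax.random.normal(k2, (block,))
    z = jnp.zeros(d).at[idx].set(z_block)          # perturb one block only
    # Two-point zeroth-order estimate of the directional derivative along z.
    g = (loss(theta + mu * z) - loss(theta - mu * z)) / (2 * mu)
    theta = theta.at[idx].add(-lr * g * z_block)   # update that block only
print(loss(theta))   # decreases substantially from the initial value of 64
```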
arXiv Detail & Related papers (2025-01-31T12:46:04Z)
- Gradient-free neural topology optimization [0.0]
Gradient-free algorithms require many more iterations to converge than gradient-based algorithms.
This has made them unviable for topology optimization, given the high computational cost per iteration and the high dimensionality of these problems.
We propose a pre-trained neural reparameterization strategy that yields at least an order-of-magnitude decrease in iteration count when optimizing designs in latent space.
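A sketch of the reparameterization idea (here with a fixed random linear "decoder" and a (1+1) evolution strategy standing in for the paper's pre-trained network and gradient-free optimizer): searching an 8-dimensional latent space instead of the 256-dimensional design space is what cuts the iteration count:
```python
import jax
import jax.numpy as jnp

# Assumed stand-ins: a fixed random linear "decoder" from an 8-dim latent space
# to a 256-dim design space, and a quadratic placeholder for the design cost.
key = jax.random.PRNGKey(0)
key, kd = jax.random.split(key)
W = jax.random.normal(kd, (256, 8))

def decode(z):
    return W @ z

def objective(design):
    return jnp.sum((design - 0.5) ** 2)   # placeholder cost to minimise

# (1+1) evolution strategy in latent space: gradient-free, low-dimensional.
z = jnp.zeros(8)
best = objective(decode(z))
sigma = 0.5
for _ in range(300):
    key, kp = jax.random.split(key)
    cand = z + sigma * jax.random.normal(kp, z.shape)
    val = objective(decode(cand))
    z, best = (cand, val) if val < best else (z, best)
print(best)
```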
arXiv Detail & Related papers (2024-03-07T23:00:49Z)
- Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape, and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
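A generic sketch of the stabilisation mechanism (a lookahead-style averaged iteration on a toy bilinear game; step size, inner-step count, and interpolation weight are assumed): plain simultaneous gradient descent-ascent spirals away from the solution, while interpolating towards the output of several base steps contracts:
```python
import jax.numpy as jnp

# Toy nonmonotone problem: the bilinear game min_x max_y x*y, solution (0, 0).
eta, k, lam = 0.5, 6, 0.3   # base step, inner steps, interpolation weight (assumed)

def gda_step(z):
    x, y = z
    return jnp.array([x - eta * y, y + eta * x])   # simultaneous GDA

z_plain = jnp.array([1.0, 1.0])
z_interp = jnp.array([1.0, 1.0])
for _ in range(30):
    z_plain = gda_step(z_plain)                    # diverges: norm grows each step
    w = z_interp
    for _ in range(k):                             # k base steps...
        w = gda_step(w)
    z_interp = (1 - lam) * z_interp + lam * w      # ...then linear interpolation
print(jnp.linalg.norm(z_plain), jnp.linalg.norm(z_interp))
```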
arXiv Detail & Related papers (2023-10-20T12:45:12Z)
- Prescriptive PCA: Dimensionality Reduction for Two-stage Stochastic Optimization [1.1612308609123565]
We develop a prescriptive dimensionality reduction framework that aims to minimize the degree of suboptimality in the optimization phase.
For the case where the downstream optimization problem has an expected value objective, we show that prescriptive dimensionality reduction can be performed via solving a distributionally-robust optimization problem.
Our approach significantly outperforms principal component analysis on real and synthetic data sets.
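A toy illustration of the quantity at stake (standard PCA followed by a downstream decision, not the paper's prescriptive or distributionally-robust method; the data and decision problem are assumptions): the gap below measures how much the reduction degrades the downstream objective:
```python
import jax
import jax.numpy as jnp

# Two-stage toy: choose weights z up front, then pay (z . xi - 1)^2 for each
# realisation xi; the best decision solves a least-squares problem in the data.
key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (500, 10)) @ jnp.diag(jnp.linspace(0.2, 2.0, 10))
ones = jnp.ones(500)

def cost(z):
    return jnp.mean((X @ z - ones) ** 2)

# Decision from the full data vs. from a standard PCA (top-2) reconstruction.
Xc = X - X.mean(axis=0)
_, _, Vt = jnp.linalg.svd(Xc, full_matrices=False)
P = Vt[:2].T @ Vt[:2]                              # projector onto top-2 subspace
X_red = X.mean(axis=0) + Xc @ P                    # rank-2 reconstruction

z_full = jnp.linalg.lstsq(X, ones)[0]
z_red = jnp.linalg.lstsq(X_red, ones)[0]
print(cost(z_red) - cost(z_full))                  # nonnegative suboptimality gap
```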
arXiv Detail & Related papers (2023-06-04T00:50:35Z)
- Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth parts.
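To illustrate linearizing only the smooth part, here is a standard Frank-Wolfe-style sketch on the probability simplex (not the paper's full algorithm; the quadratic objective is assumed): the linear minimization oracle handles the compact constraint set exactly while the smooth term is linearized at each iterate:
```python
import jax
import jax.numpy as jnp

# min_{x in simplex} f(x): only the smooth f is linearized; the compact
# constraint set is handled exactly by its linear minimization oracle.
A = jnp.array([[2.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 3.0]])
b = jnp.array([1.0, 0.2, 0.5])

def f(x):
    return 0.5 * x @ A @ x - b @ x

grad_f = jax.grad(f)
x = jnp.ones(3) / 3.0
for t in range(200):
    g = grad_f(x)
    v = jnp.zeros(3).at[jnp.argmin(g)].set(1.0)   # oracle: best simplex vertex
    gamma = 2.0 / (t + 2.0)                       # standard Frank-Wolfe step size
    x = (1 - gamma) * x + gamma * v
print(x, f(x))
```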
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
- Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
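A minimal sketch of algorithm unrolling (the generic strategy the paper analyzes, not its folded-optimization system; the inner and outer problems are toy assumptions): a fixed number of inner gradient steps is treated as a differentiable program, and automatic differentiation backpropagates through every iteration:
```python
import jax
import jax.numpy as jnp

# Inner problem: x*(theta) = argmin_x 0.5*||x - theta||^2 + 0.5*||x||^2 = theta/2,
# solved approximately by unrolling a fixed number of gradient steps.
def inner_objective(x, theta):
    return 0.5 * jnp.sum((x - theta) ** 2) + 0.5 * jnp.sum(x ** 2)

def unrolled_solve(theta, steps=50, lr=0.2):
    x = jnp.zeros_like(theta)
    g = jax.grad(inner_objective, argnums=0)
    for _ in range(steps):                 # the solver, unrolled
        x = x - lr * g(x, theta)
    return x

def outer_loss(theta):
    target = jnp.array([1.0, -2.0, 3.0])
    return jnp.sum((unrolled_solve(theta) - target) ** 2)

grad_outer = jax.grad(outer_loss)          # backpropagates through every inner step
theta = jnp.zeros(3)
for _ in range(100):
    theta = theta - 0.1 * grad_outer(theta)
print(theta)   # approaches 2 * target, since x*(theta) = theta / 2
```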
arXiv Detail & Related papers (2023-01-28T01:50:42Z)
- Geometric Methods for Sampling, Optimisation, Inference and Adaptive Agents [102.42623636238399]
We identify fundamental geometric structures that underlie the problems of sampling, optimisation, inference and adaptive decision-making.
We derive algorithms that exploit these geometric structures to solve these problems efficiently.
arXiv Detail & Related papers (2022-03-20T16:23:17Z)
- Optimization on manifolds: A symplectic approach [127.54402681305629]
We propose a dissipative extension of Dirac's theory of constrained Hamiltonian systems as a general framework for solving optimization problems.
Our class of (accelerated) algorithms is not only simple and efficient but also applicable to a broad range of contexts.
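As a generic illustration of the dissipative-Hamiltonian viewpoint (a conformal symplectic Euler step with friction on an unconstrained toy potential; the paper's constrained, Dirac-theoretic construction is more general): discretizing damped Hamiltonian dynamics yields a momentum-style optimizer:
```python
import jax
import jax.numpy as jnp

# Dissipative Hamiltonian flow for H(x, p) = V(x) + 0.5*||p||^2 with friction
# gamma, discretised by a conformal symplectic Euler step (contract, kick, drift).
def V(x):
    return 0.5 * x @ jnp.diag(jnp.array([1.0, 10.0])) @ x   # toy potential

grad_V = jax.grad(V)
h, gamma = 0.05, 1.0          # step size and friction coefficient (assumed)
x = jnp.array([2.0, 2.0])
p = jnp.zeros(2)
for _ in range(500):
    p = jnp.exp(-gamma * h) * p - h * grad_V(x)   # contract momentum, then kick
    x = x + h * p                                 # drift
print(x, V(x))   # x approaches the minimiser at the origin
```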
arXiv Detail & Related papers (2021-07-23T13:43:34Z)
- Unified Convergence Analysis for Adaptive Optimization with Moving Average Estimator [75.05106948314956]
We show that an increasingly large momentum parameter for the first-order moment is sufficient for adaptive scaling. We also give insights into increasing the momentum in a stagewise manner, in accordance with a stagewise decreasing step size.
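A schematic sketch of such a stagewise schedule (a generic moving-average gradient method; the stage lengths, step sizes, and momentum values are assumptions): each stage decreases the step size while pushing the first-moment momentum closer to one:
```python
import jax
import jax.numpy as jnp

def loss(w):
    return 0.5 * jnp.sum(jnp.array([1.0, 25.0]) * w ** 2)   # toy objective

grad = jax.grad(loss)
w = jnp.array([3.0, 3.0])
m = jnp.zeros(2)

# Stagewise schedule (assumed values): step size halves, momentum grows.
stages = [(0.02, 0.9), (0.01, 0.95), (0.005, 0.99)]   # (step size, beta1)
for lr, beta1 in stages:
    for _ in range(200):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g    # moving-average gradient estimate
        w = w - lr * m
print(w, loss(w))
```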
arXiv Detail & Related papers (2021-04-30T08:50:24Z)
- Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to problems in which one variable is subject to a sparsity constraint.
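A toy bilinear setup to fix ideas (plain synchronous gradient descent on the coupled variables; CoGD's coupling-aware projection and sparsity handling are not reproduced here):
```python
import jax
import jax.numpy as jnp

# Bilinear toy problem: recover coupled u, v from y = X (u * v), where * is an
# elementwise product; a stand-in for the coupled structure CoGD targets.
key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
X = jax.random.normal(k1, (30, 10))
u_true = jax.random.normal(k2, (10,))
y = X @ u_true                         # ground truth: u = u_true, v = all-ones

def loss(u, v):
    return 0.5 * jnp.sum((X @ (u * v) - y) ** 2)

grads = jax.grad(loss, argnums=(0, 1))
u = jnp.ones(10) * 0.5
v = jnp.ones(10) * 0.5
lr = 0.002
for _ in range(3000):
    gu, gv = grads(u, v)
    u, v = u - lr * gu, v - lr * gv    # synchronous update of both variables
print(loss(u, v))                      # final loss, far below its initial value
```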
arXiv Detail & Related papers (2020-06-16T13:41:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.