Accelerating Optimization via Differentiable Stopping Time
- URL: http://arxiv.org/abs/2505.22509v1
- Date: Wed, 28 May 2025 15:59:13 GMT
- Title: Accelerating Optimization via Differentiable Stopping Time
- Authors: Zhonglin Xie, Yiman Fong, Haoran Yuan, Zaiwen Wen
- Abstract summary: A common formulation is achieving a lower loss at a given time. Its dual, minimizing the time to reach a target loss, is believed to be non-differentiable, as the time is not differentiable. We propose a differentiable stopping time and theoretically justify it based on differential equations.
- Score: 5.043563227694138
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optimization is an important module of modern machine learning applications. Tremendous efforts have been made to accelerate optimization algorithms. A common formulation is achieving a lower loss at a given time. This enables a differentiable framework with respect to the algorithm hyperparameters. In contrast, its dual, minimizing the time to reach a target loss, is believed to be non-differentiable, as the time is not differentiable. As a result, it usually serves as a conceptual framework or is optimized using zeroth-order methods. To address this limitation, we propose a differentiable stopping time and theoretically justify it based on differential equations. An efficient algorithm is designed to backpropagate through it. As a result, the proposed differentiable stopping time enables a new differentiable formulation for accelerating algorithms. We further discuss its applications, such as online hyperparameter tuning and learning to optimize. Our proposed methods show superior performance in comprehensive experiments across various problems, which confirms their effectiveness.
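A minimal sketch of the stopping-time idea (my own toy reconstruction from the abstract, not the authors' algorithm; the quadratic loss, gradient-flow dynamics, Euler integrator, and function names are illustrative assumptions): follow the gradient flow dx/dt = -theta * L'(x) until the loss first reaches a target eps, then differentiate the hitting time T(theta) through the implicit relation L(x(T; theta)) = eps, which gives dT/dtheta = -(dL/dtheta) / (dL/dt) evaluated at t = T.

```python
# Toy illustration of a differentiable stopping time (not the paper's method;
# the quadratic loss, gradient-flow model, and Euler integrator are assumptions).
# We integrate dx/dt = -theta * L'(x) until L(x) <= eps, and differentiate the
# hitting time T(theta) via the implicit function theorem on L(x(T; theta)) = eps.

a, x0, eps = 2.0, 1.0, 1e-3            # loss L(x) = 0.5 * a * x^2
loss = lambda x: 0.5 * a * x**2
grad = lambda x: a * x
hess = lambda x: a                     # constant Hessian of the quadratic

def stopping_time_and_grad(theta, dt=1e-4):
    x, s, t = x0, 0.0, 0.0             # s = dx/dtheta, integrated alongside x
    while loss(x) > eps:
        dx = -theta * grad(x)                      # gradient flow
        ds = -grad(x) - theta * hess(x) * s        # sensitivity (variational) ODE
        x, s, t = x + dt * dx, s + dt * ds, t + dt
    # Differentiate L(x(T; theta)) = eps:  L'(x) * (dx/dt * dT/dtheta + s) = 0
    dL_dt = grad(x) * (-theta * grad(x))           # dL/dt along the flow at t = T
    dT_dtheta = -grad(x) * s / dL_dt
    return t, dT_dtheta

T, dT = stopping_time_and_grad(theta=1.0)
print(f"T = {T:.3f}, dT/dtheta = {dT:.3f}, closed form = {-T:.3f}")
```

For this quadratic the closed form is T = ln(0.5*a*x0^2/eps) / (2*theta*a), so dT/dtheta = -T/theta, which the numerical estimate reproduces; once such a gradient is available, a hyperparameter tuner can take gradient steps on theta to shrink the stopping time directly.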
Related papers
- DiffIM: Differentiable Influence Minimization with Surrogate Modeling and Continuous Relaxation [23.06479920145709]
Influence minimization (IMIN) is the problem of manipulating the structures of an input graph to reduce the propagation among nodes. We propose DiffIM, a novel method for IMIN with two differentiable schemes for acceleration. We show that each proposed scheme significantly improves speed with little (or even no) IMIN performance degradation.
arXiv Detail & Related papers (2025-02-03T03:54:23Z)
- Gradient Descent Efficiency Index [0.0]
This study introduces a new efficiency metric, Ek, designed to quantify the effectiveness of each iteration.
The proposed metric accounts for both the relative change in error and the stability of the loss function across iterations.
Ek has the potential to guide more informed decisions in the selection and tuning of optimization algorithms in machine learning applications.
arXiv Detail & Related papers (2024-10-25T10:22:22Z)
- Metric Learning to Accelerate Convergence of Operator Splitting Methods for Differentiable Parametric Programming [46.26499759722771]
This paper shows how differentiable optimization can enable the end-to-end learning of proximal metrics.
Results illustrate a strong connection between the learned proximal metrics and active constraints at the optima.
arXiv Detail & Related papers (2024-04-01T03:23:43Z)
- Analyzing and Enhancing the Backward-Pass Convergence of Unrolled Optimization [50.38518771642365]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form.
This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is equivalent to the solution of a linear system by a particular iterative method.
A system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations.
arXiv Detail & Related papers (2023-12-28T23:15:18Z)
- Locally Regularized Neural Differential Equations: Some Black Boxes Were Meant to Remain Closed! [3.222802562733787]
Implicit layer deep learning techniques, like Neural Differential Equations, have become an important modeling framework.
We develop two sampling strategies to trade off between performance and training time.
Our method reduces the number of function evaluations to 0.556-0.733x and accelerates predictions by 1.3-2x.
arXiv Detail & Related papers (2023-03-03T23:31:15Z)
- Accelerated First-Order Optimization under Nonlinear Constraints [61.98523595657983]
We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms. An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
arXiv Detail & Related papers (2023-02-01T08:50:48Z)
- Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation (a minimal fixed-point sketch of this idea appears after this list).
arXiv Detail & Related papers (2023-01-28T01:50:42Z)
- Efficient and Modular Implicit Differentiation [68.74748174316989]
We propose a unified, efficient and modular approach for implicit differentiation of optimization problems.
We show that seemingly simple principles allow us to recover many recently proposed implicit differentiation methods and create new ones easily.
arXiv Detail & Related papers (2021-05-31T17:45:58Z)
- Divide and Learn: A Divide and Conquer Approach for Predict+Optimize [50.03608569227359]
The predict+optimize problem combines machine learning of problem coefficients with an optimization problem that uses the predicted coefficients.
We show how to directly express the loss of the optimization problem in terms of the predicted coefficients as a piece-wise linear function.
We propose a novel divide and conquer algorithm to tackle optimization problems without this restriction and predict its coefficients using the optimization loss.
arXiv Detail & Related papers (2020-12-04T00:26:56Z)
- Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform optimizers into operations that are differentiable and never locally constant.
Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers (a minimal sketch of the perturbation idea follows this list).
We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
arXiv Detail & Related papers (2020-02-20T11:11:32Z)
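As a loose illustration of the perturbed-optimizer idea summarized in the last entry (a sketch under assumed details such as Gaussian noise and the noise scale, not that paper's exact recipe): a hard argmax is piecewise constant and has zero gradient almost everywhere, but its expectation under random perturbations of the input is smooth in the input, and both the smoothed output and a Jacobian estimate are available by Monte Carlo.

```python
# Loose sketch of a differentiable perturbed argmax (illustrative only; the
# Gaussian noise, noise scale sigma, and sample count are assumptions).
# The hard one-hot argmax is non-differentiable, but its average over
# perturbed inputs is smooth, with a closed-form Monte Carlo Jacobian estimate.
import numpy as np

def one_hot_argmax(theta):
    y = np.zeros_like(theta)
    y[np.argmax(theta)] = 1.0
    return y

def perturbed_argmax(theta, sigma=0.5, n_samples=10_000, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    z = rng.standard_normal((n_samples, theta.size))           # Gaussian perturbations
    y = np.array([one_hot_argmax(theta + sigma * zi) for zi in z])
    y_mean = y.mean(axis=0)                                     # smoothed "soft" argmax
    jac = (y[:, :, None] * z[:, None, :]).mean(axis=0) / sigma  # jac[i, j] ~ d y_mean[i] / d theta[j]
    return y_mean, jac

theta = np.array([1.0, 1.2, 0.3])
y, J = perturbed_argmax(theta)
print("smoothed argmax:", np.round(y, 3))
print("Jacobian estimate:\n", np.round(J, 3))
```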
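Several entries above (the folded-optimization papers and the modular implicit-differentiation work) revolve around the same observation: differentiating through the output of an iterative solver can be done via the implicit function theorem, which turns the backward pass into a single linear system rather than a backpropagation through every iteration. The sketch below illustrates this on a parametrized quadratic; the objective, step size, and variable names are illustrative assumptions, not code from any of the cited papers.

```python
# Sketch of implicit differentiation through an iterative solver (assumptions
# throughout: the quadratic objective, step size, and solver are illustrative).
# Instead of unrolling and backpropagating through every iteration, we
# differentiate the fixed-point condition x* = f(x*, theta) once, which
# amounts to solving one linear system.
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])      # objective g(x, theta) = 0.5 x'Ax - theta'x
alpha = 0.1                                  # gradient-descent step size

def f(x, theta):
    return x - alpha * (A @ x - theta)       # one gradient-descent step (the fixed-point map)

def solve(theta, iters=500):
    x = np.zeros_like(theta)
    for _ in range(iters):                   # forward pass: run the solver to (near) convergence
        x = f(x, theta)
    return x

theta = np.array([1.0, -2.0])
x_star = solve(theta)

# Implicit function theorem: dx*/dtheta = (I - df/dx)^{-1} df/dtheta.
# Here df/dx = I - alpha*A and df/dtheta = alpha*I, so the backward pass is a
# single linear solve instead of backprop through 500 iterations.
J_x = np.eye(2) - alpha * A
J_theta = alpha * np.eye(2)
dx_dtheta = np.linalg.solve(np.eye(2) - J_x, J_theta)

print("x* =", np.round(x_star, 4))               # should match A^{-1} theta
print("dx*/dtheta =\n", np.round(dx_dtheta, 4))  # should match A^{-1}
print("A^{-1} =\n", np.round(np.linalg.inv(A), 4))
```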