Optimizing Optimizers for Fast Gradient-Based Learning
- URL: http://arxiv.org/abs/2512.06370v1
- Date: Sat, 06 Dec 2025 09:50:41 GMT
- Title: Optimizing Optimizers for Fast Gradient-Based Learning
- Authors: Jaerin Lee, Kyoung Mu Lee,
- Abstract summary: We lay the theoretical foundation for automating design in gradient-based learning.<n>By treating gradient loss signals as a function that translates to parameter motions, the problem reduces to a family of convex optimization problems.
- Score: 53.81268610971847
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We lay the theoretical foundation for automating optimizer design in gradient-based learning. Based on the greedy principle, we formulate the problem of designing optimizers as maximizing the instantaneous decrease in loss. By treating an optimizer as a function that translates loss gradient signals into parameter motions, the problem reduces to a family of convex optimization problems over the space of optimizers. Solving these problems under various constraints not only recovers a wide range of popular optimizers as closed-form solutions, but also produces the optimal hyperparameters of these optimizers with respect to the problems at hand. This enables a systematic approach to design optimizers and tune their hyperparameters according to the gradient statistics that are collected during the training process. Furthermore, this optimization of optimization can be performed dynamically during training.
Related papers
- Gradient-free neural topology optimization [0.0]
gradient-free algorithms require many more iterations to converge when compared to gradient-based algorithms.
This has made them unviable for topology optimization due to the high computational cost per iteration and high dimensionality of these problems.
We propose a pre-trained neural reparameterization strategy that leads to at least one order of magnitude decrease in iteration count when optimizing the designs in latent space.
arXiv Detail & Related papers (2024-03-07T23:00:49Z) - Analyzing and Enhancing the Backward-Pass Convergence of Unrolled
Optimization [50.38518771642365]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form.
This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is equivalent to the solution of a linear system by a particular iterative method.
A system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations.
arXiv Detail & Related papers (2023-12-28T23:15:18Z) - Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z) - A Nonstochastic Control Approach to Optimization [26.744354103012448]
We show how recent methods from control preconditions can overcome the challenge of convex nonity.
We can learn a method that attains a similar result in hindsight from a class of methods.
arXiv Detail & Related papers (2023-01-19T06:08:01Z) - Teaching Networks to Solve Optimization Problems [13.803078209630444]
We propose to replace the iterative solvers altogether with a trainable parametric set function.
We show the feasibility of learning such parametric (set) functions to solve various classic optimization problems.
arXiv Detail & Related papers (2022-02-08T19:13:13Z) - Speeding up Computational Morphogenesis with Online Neural Synthetic
Gradients [51.42959998304931]
A wide range of modern science and engineering applications are formulated as optimization problems with a system of partial differential equations (PDEs) as constraints.
These PDE-constrained optimization problems are typically solved in a standard discretize-then-optimize approach.
We propose a general framework to speed up PDE-constrained optimization using online neural synthetic gradients (ONSG) with a novel two-scale optimization scheme.
arXiv Detail & Related papers (2021-04-25T22:43:51Z) - Learning to Optimize Under Constraints with Unsupervised Deep Neural
Networks [0.0]
We propose a machine learning (ML) method to learn how to solve a generic constrained continuous optimization problem.
In this paper, we propose an unsupervised deep learning (DL) solution for solving constrained optimization problems in real-time.
arXiv Detail & Related papers (2021-01-04T02:58:37Z) - Optimizing Optimizers: Regret-optimal gradient descent algorithms [9.89901717499058]
We study the existence, uniqueness and consistency of regret-optimal algorithms.
By providing first-order optimality conditions for the control problem, we show that regret-optimal algorithms must satisfy a specific structure in their dynamics.
We present fast numerical methods for approximating them, generating optimization algorithms which directly optimize their long-term regret.
arXiv Detail & Related papers (2020-12-31T19:13:53Z) - Automatically Learning Compact Quality-aware Surrogates for Optimization
Problems [55.94450542785096]
Solving optimization problems with unknown parameters requires learning a predictive model to predict the values of the unknown parameters and then solving the problem using these values.
Recent work has shown that including the optimization problem as a layer in a complex training model pipeline results in predictions of iteration of unobserved decision making.
We show that we can improve solution quality by learning a low-dimensional surrogate model of a large optimization problem.
arXiv Detail & Related papers (2020-06-18T19:11:54Z) - A Primer on Zeroth-Order Optimization in Signal Processing and Machine
Learning [95.85269649177336]
ZO optimization iteratively performs three major steps: gradient estimation, descent direction, and solution update.
We demonstrate promising applications of ZO optimization, such as evaluating and generating explanations from black-box deep learning models, and efficient online sensor management.
arXiv Detail & Related papers (2020-06-11T06:50:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.