Learning to Optimize with Dynamic Mode Decomposition
- URL: http://arxiv.org/abs/2211.16268v1
- Date: Tue, 29 Nov 2022 14:55:59 GMT
- Title: Learning to Optimize with Dynamic Mode Decomposition
- Authors: Petr Šimánek, Daniel Vašata, Pavel Kordík
- Abstract summary: We show how to utilize the dynamic mode decomposition method for extracting informative features about optimization dynamics.
We show that our learned optimizer generalizes much better to unseen optimization problems.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Designing faster optimization algorithms is of ever-growing interest. In
recent years, learning-to-learn methods that learn how to optimize have demonstrated
very encouraging results. Current approaches usually do not effectively include
the dynamics of the optimization process during training. They either omit it
entirely or only implicitly assume the dynamics of an isolated parameter. In
this paper, we show how to utilize the dynamic mode decomposition method for
extracting informative features about optimization dynamics. By employing those
features, we show that our learned optimizer generalizes much better to unseen
optimization problems. The improved generalization is illustrated on
multiple tasks where training the optimizer on one neural network generalizes
to different architectures and distinct datasets.
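As a rough illustration of the DMD step described above, the sketch below fits a linear operator to a window of trajectory snapshots and extracts its leading eigenvalues as dynamics features. The snapshot construction (columns of flattened optimizer states) and the rank cutoff are assumptions for illustration, not the paper's exact feature pipeline.

```python
import numpy as np

def dmd_features(snapshots, rank=5):
    """Extract DMD eigenvalues/modes from a window of optimizer-trajectory snapshots.

    snapshots: array of shape (d, m) whose columns are successive states
               (e.g. flattened parameters or gradients at steps 0..m-1).
    Returns the leading DMD eigenvalues of the best-fit linear dynamics
    x_{k+1} ~= A x_k, plus the corresponding modes.
    """
    X, Y = snapshots[:, :-1], snapshots[:, 1:]           # paired snapshot matrices
    U, s, Vh = np.linalg.svd(X, full_matrices=False)     # reduced SVD of X
    r = min(rank, int((s > 1e-10).sum()))                # truncate to numerical rank
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]
    A_tilde = U.conj().T @ Y @ Vh.conj().T @ np.diag(1.0 / s)  # projected operator
    eigvals, W = np.linalg.eig(A_tilde)                  # DMD eigenvalues
    modes = Y @ Vh.conj().T @ np.diag(1.0 / s) @ W       # DMD modes
    return eigvals, modes

# toy usage: a decaying oscillatory trajectory
t = np.arange(30)
traj = np.stack([np.exp(-0.05 * t) * np.cos(0.3 * t),
                 np.exp(-0.05 * t) * np.sin(0.3 * t)])
eigvals, _ = dmd_features(traj, rank=2)
print(np.abs(eigvals))   # magnitudes below 1 indicate contracting dynamics
```

The eigenvalue magnitudes and phases summarize whether the local dynamics are contracting or oscillatory, which is the kind of information a learned optimizer could condition on.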
Related papers
- Narrowing the Focus: Learned Optimizers for Pretrained Models [24.685918556547055]
We propose a novel technique that learns a layer-specific linear combination of update directions provided by a set of base optimizers.
When evaluated on image classification tasks, this specialized optimizer significantly outperforms both traditional off-the-shelf methods such as Adam, as well as existing general learned optimizers.
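A minimal sketch of the layer-specific combination idea, assuming three hand-written base update rules and per-layer coefficients `alphas`; the actual base optimizers, feature set, and meta-training procedure of the paper are not reproduced here.

```python
import numpy as np

def sgd_dir(grad, state):
    # base direction 1: plain negative gradient
    return -grad

def momentum_dir(grad, state, beta=0.9):
    # base direction 2: heavy-ball momentum
    state["m"] = beta * state.get("m", 0.0) + grad
    return -state["m"]

def adam_like_dir(grad, state, beta2=0.999, eps=1e-8):
    # base direction 3: RMS-normalized gradient
    state["v"] = beta2 * state.get("v", 0.0) + (1 - beta2) * grad ** 2
    return -grad / (np.sqrt(state["v"]) + eps)

BASES = [sgd_dir, momentum_dir, adam_like_dir]

def combined_update(grads_per_layer, states, alphas, lr=1e-3):
    """alphas[l][k] is the learned weight of base direction k for layer l."""
    updates = []
    for l, g in enumerate(grads_per_layer):
        dirs = [base(g, states[l][k]) for k, base in enumerate(BASES)]
        updates.append(lr * sum(a * d for a, d in zip(alphas[l], dirs)))
    return updates

# toy usage for a two-"layer" model; in practice alphas would come from meta-training
grads = [np.ones(3), np.full(2, 0.5)]
states = [[{} for _ in BASES] for _ in grads]
alphas = np.full((len(grads), len(BASES)), 1.0 / len(BASES))
print(combined_update(grads, states, alphas))
```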
arXiv Detail & Related papers (2024-08-17T23:55:19Z)
- Two Optimizers Are Better Than One: LLM Catalyst Empowers Gradient-Based Optimization for Prompt Tuning [69.95292905263393]
We show that gradient-based optimization and large language models (LLMs) are complementary to each other, suggesting a collaborative optimization approach.
Our code is released at https://www.guozix.com/guozix/LLM-catalyst.
arXiv Detail & Related papers (2024-05-30T06:24:14Z)
- Analyzing and Enhancing the Backward-Pass Convergence of Unrolled Optimization [50.38518771642365]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form.
This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is equivalent to the solution of a linear system by a particular iterative method.
A system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations.
arXiv Detail & Related papers (2023-12-28T23:15:18Z)
- Investigation into the Training Dynamics of Learned Optimizers [0.0]
We look at the concept of learned optimizers as a way to accelerate the optimization process by replacing traditional, hand-crafted algorithms with meta-learned functions.
Our work examines their optimization dynamics from the perspective of network architecture symmetries and parameter updates.
We identify several key insights that demonstrate how each approach can benefit from the strengths of the other.
arXiv Detail & Related papers (2023-12-12T11:18:43Z)
- A Survey on Multi-Objective based Parameter Optimization for Deep Learning [1.3223682837381137]
We focus on exploring the effectiveness of multi-objective optimization strategies for parameter optimization in conjunction with deep neural networks.
The two methods are combined to provide valuable insights into the generation of predictions and analysis in multiple applications.
arXiv Detail & Related papers (2023-05-17T07:48:54Z)
- Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
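To make the unrolling strategy concrete, here is a minimal sketch (an assumption for illustration, not the folded-optimization construction itself) that backpropagates through the iterations of a projected-gradient solver for a toy box-constrained quadratic, so gradients of the solution with respect to the problem data come straight from automatic differentiation.

```python
import torch

def unrolled_box_qp(q, lo=0.0, hi=1.0, steps=50, lr=0.1):
    """Solve min_x 0.5*||x||^2 + q.x  s.t. lo <= x <= hi by projected
    gradient descent, keeping every iterate on the autodiff tape."""
    x = torch.zeros_like(q)
    for _ in range(steps):
        x = torch.clamp(x - lr * (x + q), lo, hi)   # gradient step + projection
    return x

q = torch.tensor([0.3, -0.7, -2.0], requires_grad=True)
x_star = unrolled_box_qp(q)
x_star.sum().backward()          # backward pass runs through all unrolled iterations
print(x_star.detach(), q.grad)   # d(sum x*)/dq obtained purely via unrolling
```

The coordinates that end up strictly inside the box receive a gradient of roughly -1, while clipped coordinates receive roughly 0, matching the sensitivity of the true solution x* = clamp(-q, lo, hi).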
arXiv Detail & Related papers (2023-01-28T01:50:42Z)
- VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatile optimizers.
We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates.
We open source our learned optimizers, meta-training code, the associated train and test data, and an extensive benchmark suite with baselines at velo-code.io.
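As a loose illustration of an optimizer that is itself a small neural network, the sketch below applies a tiny per-parameter MLP to hand-picked features (gradient, momentum, current value) and emits an update. The features, architecture, and randomly initialized weights are placeholders; a real learned optimizer such as VeLO gets its weights from large-scale meta-training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny per-parameter MLP: 3 input features -> 8 hidden units -> 1 output (the update).
# In a real learned optimizer these weights come from meta-training, not random init.
W1, b1 = rng.normal(scale=0.1, size=(3, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.1, size=(8, 1)), np.zeros(1)

def learned_update(grad, momentum, param, scale=1e-3):
    feats = np.stack([grad, momentum, param], axis=-1)   # (n, 3) per-parameter features
    h = np.tanh(feats @ W1 + b1)                         # shared MLP applied per parameter
    return scale * (h @ W2 + b2)[..., 0]                 # proposed parameter update

# toy usage: one step on a single parameter tensor
params = rng.normal(size=5)
grads = 2 * params                                       # gradient of sum(p^2)
momentum = grads.copy()
params = params + learned_update(grads, momentum, params)
print(params)
```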
arXiv Detail & Related papers (2022-11-17T18:39:07Z)
- A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases [44.01339030872185]
Blackbox learned optimizers often struggle with stability and generalization when applied to tasks unlike those in their meta-training set.
We investigate the inductive biases and stability properties of optimization algorithms, and apply the resulting insights to designing inductive biases for blackbox optimizers.
We apply the resulting learned optimizer to a variety of neural network training tasks, where it outperforms the current state-of-the-art learned optimizers.
arXiv Detail & Related papers (2022-09-22T17:47:21Z)
- Reverse engineering learned optimizers reveals known and novel mechanisms [50.50540910474342]
Learned optimizers are algorithms that can themselves be trained to solve optimization problems.
Our results help elucidate the previously murky understanding of how learned optimizers work, and establish tools for interpreting future learned optimizers.
arXiv Detail & Related papers (2020-11-04T07:12:43Z)
- Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves [53.37905268850274]
We introduce a new, hierarchical, neural-network-parameterized optimizer with access to additional features such as validation loss to enable automatic regularization.
Most learned optimizers have been trained on only a single task, or a small number of tasks.
We train ours on thousands of tasks, making use of orders of magnitude more compute, resulting in optimizers that generalize better to unseen tasks.
arXiv Detail & Related papers (2020-09-23T16:35:09Z)