Optimizer Amalgamation
- URL: http://arxiv.org/abs/2203.06474v2
- Date: Tue, 15 Mar 2022 01:17:23 GMT
- Title: Optimizer Amalgamation
- Authors: Tianshu Huang, Tianlong Chen, Sijia Liu, Shiyu Chang, Lisa Amini,
Zhangyang Wang
- Abstract summary: We are motivated to study a new problem named Optimizer Amalgamation: how can we best combine a pool of "teacher" optimizers into a single "student" optimizer that can have stronger problem-specific performance?
First, we define three differentiable mechanisms to amalgamate a pool of analytical optimizers by gradient descent.
To reduce the variance of the amalgamation process, we also explore methods to stabilize it by perturbing the amalgamation target.
- Score: 124.33523126363728
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Selecting an appropriate optimizer for a given problem is of major interest
for researchers and practitioners. Many analytical optimizers have been
proposed using a variety of theoretical and empirical approaches; however, none
can offer a universal advantage over other competitive optimizers. We are thus
motivated to study a new problem named Optimizer Amalgamation: how can we best
combine a pool of "teacher" optimizers into a single "student" optimizer that
can have stronger problem-specific performance? In this paper, we draw
inspiration from the field of "learning to optimize" to use a learnable
amalgamation target. First, we define three differentiable amalgamation
mechanisms to amalgamate a pool of analytical optimizers by gradient descent.
Then, in order to reduce variance of the amalgamation process, we also explore
methods to stabilize the amalgamation process by perturbing the amalgamation
target. Finally, we present experiments showing the superiority of our
amalgamated optimizer compared to its amalgamated components and learning to
optimize baselines, and the efficacy of our variance reducing perturbations.
Our code and pre-trained models are publicly available at
http://github.com/VITA-Group/OptimizerAmalgamation.
Related papers
- Improving Existing Optimization Algorithms with LLMs [0.9668407688201361]
This paper investigates how Large Language Models (LLMs) can enhance existing optimization algorithms.
Using their pre-trained knowledge, we demonstrate their ability to propose innovative variations and implementation strategies.
Our results show that an alternative heuristic proposed by GPT-4o outperforms the expert-designed heuristic of CMSA.
arXiv Detail & Related papers (2025-02-12T10:58:57Z) - Bayesian Optimization with Preference Exploration by Monotonic Neural Network Ensemble [3.004066195320147]
We propose using a neural network ensemble as a utility surrogate model.
This approach naturally integrates monotonicity and supports pairwise comparison data.
An ablation study highlights the critical role of monotonicity in enhancing performance.
arXiv Detail & Related papers (2025-01-30T22:50:34Z) - LLM as a Complementary Optimizer to Gradient Descent: A Case Study in Prompt Tuning [69.95292905263393]
We show that gradient-based optimization and high-level LLM-based optimization are complementary to each other and can effectively collaborate in a combined optimization framework.
arXiv Detail & Related papers (2024-05-30T06:24:14Z) - Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z) - An Empirical Evaluation of Zeroth-Order Optimization Methods on
AI-driven Molecule Optimization [78.36413169647408]
We study the effectiveness of various ZO optimization methods for optimizing molecular objectives.
We show the advantages of ZO sign-based gradient descent (ZO-signGD).
We demonstrate the potential effectiveness of ZO optimization methods on widely used benchmark tasks from the Guacamol suite.
arXiv Detail & Related papers (2022-10-27T01:58:10Z) - Teaching Networks to Solve Optimization Problems [13.803078209630444]
We propose to replace the iterative solvers altogether with a trainable parametric set function.
We show the feasibility of learning such parametric (set) functions to solve various classic optimization problems.
arXiv Detail & Related papers (2022-02-08T19:13:13Z) - SnAKe: Bayesian Optimization with Pathwise Exploration [9.807656882149319]
We consider a novel setting where the expense of evaluating the function can increase significantly when making large input changes between iterations.
This paper investigates the problem and introduces 'Sequential Bayesian Optimization via Adaptive Connecting Samples' (SnAKe).
It provides a solution by considering future queries and preemptively building optimization paths that minimize input costs.
arXiv Detail & Related papers (2022-01-31T19:42:56Z) - Divide and Learn: A Divide and Conquer Approach for Predict+Optimize [50.03608569227359]
The predict+optimize problem combines machine learning of problem coefficients with an optimization problem that uses the predicted coefficients.
We show how to directly express the loss of the optimization problem in terms of the predicted coefficients as a piece-wise linear function.
We propose a novel divide and conquer algorithm to tackle optimization problems without this restriction and predict its coefficients using the optimization loss.
arXiv Detail & Related papers (2020-12-04T00:26:56Z) - Reverse engineering learned optimizers reveals known and novel
mechanisms [50.50540910474342]
Learned optimizers are algorithms that can themselves be trained to solve optimization problems.
Our results help elucidate the previously murky understanding of how learned optimizers work, and establish tools for interpreting future learned optimizers.
arXiv Detail & Related papers (2020-11-04T07:12:43Z) - Descending through a Crowded Valley - Benchmarking Deep Learning
Optimizers [29.624308090226375]
In this work, we aim to replace these anecdotes, if not with a conclusive ranking, then at least with evidence-backed anecdotes.
To do so, we perform an extensive, standardized benchmark of fifteen particularly popular deep learning optimizers.
Our open-sourced results are available as challenging and well-tuned baselines for more meaningful evaluations of novel optimization methods.
arXiv Detail & Related papers (2020-07-03T08:19:36Z)