Do Current Multi-Task Optimization Methods in Deep Learning Even Help?
- URL: http://arxiv.org/abs/2209.11379v1
- Date: Fri, 23 Sep 2022 02:45:13 GMT
- Title: Do Current Multi-Task Optimization Methods in Deep Learning Even Help?
- Authors: Derrick Xin, Behrooz Ghorbani, Ankush Garg, Orhan Firat, Justin Gilmer
- Abstract summary: We show that, despite the added design and computational complexity of these algorithms, MTO methods do not yield any performance improvements beyond what is achievable via traditional optimization approaches.
We highlight alternative strategies that consistently yield improvements to the performance profile and point out common training pitfalls that might cause suboptimal results.
- Score: 35.27168056803643
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research has proposed a series of specialized optimization algorithms
for deep multi-task models. It is often claimed that these multi-task
optimization (MTO) methods yield solutions that are superior to the ones found
by simply optimizing a weighted average of the task losses. In this paper, we
perform large-scale experiments on a variety of language and vision tasks to
examine the empirical validity of these claims. We show that, despite the added
design and computational complexity of these algorithms, MTO methods do not
yield any performance improvements beyond what is achievable via traditional
optimization approaches. We highlight alternative strategies that consistently
yield improvements to the performance profile and point out common training
pitfalls that might cause suboptimal results. Finally, we outline challenges in
reliably evaluating the performance of MTO algorithms and discuss potential
solutions.
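The traditional baseline the abstract contrasts with MTO methods is scalarization: plain gradient descent on a fixed weighted sum of the task losses. A minimal PyTorch-style sketch of one such training step (the model, optimizer, per-task batches, loss functions, and weights are all placeholders, not from the paper):
```python
import torch

def scalarized_step(model, optimizer, batches, loss_fns, weights):
    """One update on a fixed weighted sum of task losses.

    The weights are ordinary hyperparameters swept offline; no per-task
    gradient surgery or adaptive reweighting is performed.
    """
    optimizer.zero_grad()
    total = torch.zeros(())
    for (inputs, targets), loss_fn, w in zip(batches, loss_fns, weights):
        total = total + w * loss_fn(model(inputs), targets)
    total.backward()   # a single backward pass over the combined loss
    optimizer.step()
    return total.item()
```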
Related papers
- Learning Joint Models of Prediction and Optimization [56.04498536842065]
The Predict-Then-Optimize framework uses machine learning models to predict unknown parameters of an optimization problem from features before solving it.
This paper proposes an alternative method, in which optimal solutions are learned directly from the observable features by joint predictive models.
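A sketch of the contrast drawn above, with a hypothetical per-instance `solve` routine and illustrative sklearn models (not from the paper):
```python
import numpy as np
from sklearn.linear_model import LinearRegression

def predict_then_optimize(param_model, features, solve):
    """Two-stage pipeline: predict problem parameters, then call a solver."""
    c_hat = param_model.predict(features)        # predicted parameters
    return np.array([solve(c) for c in c_hat])   # one solver call per instance

def fit_direct_solution_model(features, optimal_solutions):
    """Joint alternative: regress optimal solutions directly from features,
    so no solver is needed at inference time."""
    return LinearRegression().fit(features, optimal_solutions)
```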
arXiv Detail & Related papers (2024-09-07T19:52:14Z)
- Unraveling the Versatility and Impact of Multi-Objective Optimization: Algorithms, Applications, and Trends for Solving Complex Real-World Problems [4.023511716339818]
Multi-Objective Optimization (MOO) techniques have become increasingly popular in recent years.
This paper examines recently developed MOO-based algorithms.
In real-world case studies, MOO algorithms address complicated decision-making challenges.
arXiv Detail & Related papers (2024-06-29T15:19:46Z)
- Iterative or Innovative? A Problem-Oriented Perspective for Code Optimization [81.88668100203913]
Large language models (LLMs) have demonstrated strong capabilities in solving a wide range of programming tasks.
In this paper, we explore code optimization with a focus on performance enhancement, specifically aiming to optimize code for minimal execution time.
arXiv Detail & Related papers (2024-06-17T16:10:10Z)
- Discovering Preference Optimization Algorithms with and for Large Language Models [50.843710797024805]
Offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs.
We perform objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention.
Experiments demonstrate the state-of-the-art performance of DiscoPOP, a novel algorithm that adaptively blends logistic and exponential losses.
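The discovered loss itself is not reproduced here; the sketch below shows one plausible form of a sigmoid-gated blend of logistic and exponential preference losses (the gating and `beta` are illustrative assumptions, not DiscoPOP's exact formula):
```python
import torch
import torch.nn.functional as F

def blended_preference_loss(logratio_chosen, logratio_rejected, beta=0.1):
    """Sketch of a preference loss that adaptively blends logistic and
    exponential terms on the preference margin (illustrative only)."""
    rho = beta * (logratio_chosen - logratio_rejected)  # preference margin
    logistic = -F.logsigmoid(rho)     # DPO-style logistic loss
    exponential = torch.exp(-rho)     # exponential loss on the same margin
    gate = torch.sigmoid(rho)         # data-dependent blending weight
    return (gate * logistic + (1.0 - gate) * exponential).mean()
```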
arXiv Detail & Related papers (2024-06-12T16:58:41Z)
- Analyzing and Enhancing the Backward-Pass Convergence of Unrolled Optimization [50.38518771642365]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form.
This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is equivalent to the solution of a linear system by a particular iterative method.
A system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations.
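To illustrate that equivalence, backpropagating through a fixed point x* = T(x*, θ) reduces to solving the linear system v = g + v·∂T/∂x for the adjoint v, which a fixed-point iteration solves without storing forward iterates. A generic sketch of standard implicit differentiation (not the paper's folded-optimization rules):
```python
import torch

def implicit_backward(T, x_star, theta, grad_out, n_iters=50):
    """Gradient of a loss w.r.t. theta through x* = T(x*, theta),
    given grad_out = dLoss/dx*. Assumes theta.requires_grad is True
    and the forward iteration is contractive near x*."""
    x_star = x_star.detach().requires_grad_()
    Tx = T(x_star, theta)
    v = grad_out.clone()
    for _ in range(n_iters):
        # one step of the iterative linear solve v = grad_out + v * dT/dx
        v = grad_out + torch.autograd.grad(Tx, x_star, v, retain_graph=True)[0]
    # chain the converged adjoint through theta
    return torch.autograd.grad(Tx, theta, v)[0]
```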
arXiv Detail & Related papers (2023-12-28T23:15:18Z)
- An Empirical Evaluation of Zeroth-Order Optimization Methods on AI-driven Molecule Optimization [78.36413169647408]
We study the effectiveness of various ZO optimization methods for optimizing molecular objectives.
We show the advantages of ZO sign-based gradient descent (ZO-signGD).
We demonstrate the potential effectiveness of ZO optimization methods on widely used benchmark tasks from the Guacamol suite.
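ZO-signGD is simple to state: estimate the gradient from random finite differences of the black-box objective, then step along its sign. A hedged NumPy sketch (all hyperparameters illustrative):
```python
import numpy as np

def zo_sign_gd(f, x, steps=100, lr=0.01, mu=1e-3, n_dirs=16, seed=0):
    """Zeroth-order sign-based gradient descent on a black-box objective f."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float).copy()
    for _ in range(steps):
        g = np.zeros_like(x)
        for _ in range(n_dirs):
            u = rng.standard_normal(x.shape)            # random direction
            g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
        x -= lr * np.sign(g / n_dirs)   # only the sign of the estimate is used
    return x
```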
arXiv Detail & Related papers (2022-10-27T01:58:10Z)
- Sample-Efficient, Exploration-Based Policy Optimisation for Routing Problems [2.6782615615913348]
This paper presents a new entropy-based reinforcement learning approach.
In addition, we design an off-policy-based reinforcement learning technique that maximises the expected return.
We show that our model can generalise to various routing problems.
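The summary is terse about the objective; a standard entropy-regularized policy-gradient loss of the general kind alluded to (not the paper's exact formulation) looks like this:
```python
import torch

def entropy_regularized_pg_loss(log_probs, returns, entropies, ent_coef=0.01):
    """REINFORCE-style loss with an entropy bonus that sustains exploration
    while the expected return is maximised (generic sketch)."""
    pg_term = -(log_probs * returns).mean()    # maximise expected return
    ent_term = -ent_coef * entropies.mean()    # entropy bonus, negated as loss
    return pg_term + ent_term
```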
arXiv Detail & Related papers (2022-05-31T09:51:48Z)
- A survey on multi-objective hyperparameter optimization algorithms for Machine Learning [62.997667081978825]
This article presents a systematic survey of the literature published between 2014 and 2020 on multi-objective HPO algorithms.
We distinguish between metaheuristic-based algorithms, metamodel-based algorithms, and approaches using a mixture of both.
We also discuss the quality metrics used to compare multi-objective HPO procedures and present future research directions.
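One standard quality metric in this literature is hypervolume. A minimal two-objective (minimization) version, assuming the front is mutually non-dominated and every point dominates the reference point:
```python
import numpy as np

def hypervolume_2d(front, ref):
    """Area dominated by a 2-objective minimization front, up to `ref`."""
    pts = sorted(map(tuple, np.asarray(front)))   # ascending f1 => descending f2
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        hv += (ref[0] - f1) * (prev_f2 - f2)      # slab contributed by this point
        prev_f2 = f2
    return hv

# hypervolume_2d([[1, 3], [2, 2], [3, 1]], ref=(4, 4)) -> 6.0
```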
arXiv Detail & Related papers (2021-11-23T10:22:30Z)
- A novel multiobjective evolutionary algorithm based on decomposition and multi-reference points strategy [14.102326122777475]
The multiobjective evolutionary algorithm based on decomposition (MOEA/D) has been regarded as a significantly promising approach for solving multiobjective optimization problems (MOPs).
We propose an improved MOEA/D algorithm by virtue of the well-known Pascoletti-Serafini scalarization method and a new strategy of multi-reference points.
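For reference, the Pascoletti-Serafini scalarization reduces a vector objective to the smallest t with f(x) ≤ a + t·r componentwise, given a reference point a and positive direction r. A one-line sketch for a minimization MOP:
```python
import numpy as np

def pascoletti_serafini_value(f_x, a, r):
    """Smallest t such that f_x <= a + t * r holds componentwise (r > 0)."""
    return float(np.max((np.asarray(f_x) - np.asarray(a)) / np.asarray(r)))

# pascoletti_serafini_value([2.0, 5.0], a=[0.0, 0.0], r=[1.0, 1.0]) -> 5.0
```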
arXiv Detail & Related papers (2021-10-27T02:07:08Z)
- PAMELI: A Meta-Algorithm for Computationally Expensive Multi-Objective Optimization Problems [0.0]
The proposed algorithm is based on solving a set of surrogate problems defined by models of the real one.
Our algorithm also performs a meta-search for optimal surrogate models and navigation strategies for the optimization landscape.
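A generic surrogate-assisted loop of the kind described (single objective for brevity; PAMELI's meta-search over surrogate models and navigation strategies is not shown, and all names here are illustrative):
```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def surrogate_loop(evaluate, lo, hi, n_init=10, n_rounds=20, n_cand=500, seed=0):
    """Fit a cheap model of the expensive objective, optimize the model,
    evaluate the proposal for real, and repeat."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    X = rng.uniform(lo, hi, size=(n_init, lo.size))
    y = np.array([evaluate(x) for x in X])                 # expensive calls
    for _ in range(n_rounds):
        surrogate = GaussianProcessRegressor().fit(X, y)   # model of the real problem
        cand = rng.uniform(lo, hi, size=(n_cand, lo.size))
        x_next = cand[np.argmin(surrogate.predict(cand))]  # optimize the cheap model
        X = np.vstack([X, x_next])
        y = np.append(y, evaluate(x_next))
    return X[np.argmin(y)], y.min()
```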
arXiv Detail & Related papers (2021-03-19T11:18:03Z)
- Meta Learning Black-Box Population-Based Optimizers [0.0]
We propose the use of meta-learning to infer population-based black-box optimizers.
We show that the meta-loss function encourages a learned algorithm to alter its search behavior so that it can easily fit into a new context.
arXiv Detail & Related papers (2021-03-05T08:13:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.