Learning Functions to Study the Benefit of Multitask Learning
- URL: http://arxiv.org/abs/2006.05561v2
- Date: Mon, 28 Sep 2020 06:19:12 GMT
- Title: Learning Functions to Study the Benefit of Multitask Learning
- Authors: Gabriele Bettgenh\"auser, Michael A. Hedderich, Dietrich Klakow
- Abstract summary: We study and quantify the generalization patterns of multitask learning (MTL) models for sequence labeling tasks.
Although multitask learning has achieved improved performance in some problems, there are also tasks that lose performance when trained together.
- Score: 25.325601027501836
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study and quantify the generalization patterns of multitask learning (MTL)
models for sequence labeling tasks. MTL models are trained to optimize a set of
related tasks jointly. Although multitask learning has achieved improved
performance in some problems, there are also tasks that lose performance when
trained together. These mixed results motivate us to study the factors that
impact the performance of MTL models. We note that theoretical bounds and
convergence rates for MTL models exist, but they rely on strong assumptions
such as task relatedness and the use of balanced datasets. To remedy these
limitations, we propose the creation of a task simulator and the use of
Symbolic Regression to learn expressions relating model performance to possible
factors of influence. For MTL, we study the model performance against the
number of tasks (T), the number of samples per task (n) and the task
relatedness measured by the adjusted mutual information (AMI). In our
experiments, we could empirically find formulas relating model performance with
factors of sqrt(n), sqrt(T), which are equivalent to sound mathematical proofs
in Maurer[2016], and we went beyond by discovering that performance relates to
a factor of sqrt(AMI).
Related papers
- Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework [81.29965270493238]
We develop a specialized dataset aimed at enhancing the evaluation and fine-tuning of large language models (LLMs) for wireless communication applications.
The dataset includes a diverse set of multi-hop questions, including true/false and multiple-choice types, spanning varying difficulty levels from easy to hard.
We introduce a Pointwise V-Information (PVI) based fine-tuning method, providing a detailed theoretical analysis and justification for its use in quantifying the information content of training data.
arXiv Detail & Related papers (2025-01-16T16:19:53Z) - The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities [51.594836904623534]
We investigate whether instruction-tuned models possess fundamentally different capabilities from base models that are prompted using in-context examples.
We show that the performance of instruction-tuned models is significantly correlated with the in-context performance of their base counterparts.
Specifically, we extend this understanding to instruction-tuned models, suggesting that their pretraining data similarly sets a limiting boundary on the tasks they can solve.
arXiv Detail & Related papers (2025-01-15T10:57:55Z) - MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic [6.46176287368784]
We propose textbfModel textbfExclusive textbfTask textbfArithmetic for merging textbfGPT-scale models.
Our proposed MetaGPT is data-agnostic and bypasses the heavy search process, making it cost-effective and easy to implement for LLMs.
arXiv Detail & Related papers (2024-06-17T10:12:45Z) - Modeling Output-Level Task Relatedness in Multi-Task Learning with Feedback Mechanism [7.479892725446205]
Multi-task learning (MTL) is a paradigm that simultaneously learns multiple tasks by sharing information at different levels.
We introduce a posteriori information into the model, considering that different tasks may produce correlated outputs with mutual influences.
We achieve this by incorporating a feedback mechanism into MTL models, where the output of one task serves as a hidden feature for another task.
arXiv Detail & Related papers (2024-04-01T03:27:34Z) - Distribution Matching for Multi-Task Learning of Classification Tasks: a
Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework, where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks with little, or non-overlapping annotations.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z) - Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs [65.42104819071444]
Multitask learning (MTL) leverages task-relatedness to enhance performance.
We employ high-order tensors, with each mode corresponding to a task index, to naturally represent tasks referenced by multiple indices.
We propose a general framework of low-rank MTL methods with tensorized support vector machines (SVMs) and least square support vector machines (LSSVMs)
arXiv Detail & Related papers (2023-08-30T14:28:26Z) - "It's a Match!" -- A Benchmark of Task Affinity Scores for Joint
Learning [74.14961250042629]
Multi-Task Learning (MTL) promises attractive, characterizing the conditions of its success is still an open problem in Deep Learning.
Estimateing task affinity for joint learning is a key endeavor.
Recent work suggests that the training conditions themselves have a significant impact on the outcomes of MTL.
Yet, the literature is lacking a benchmark to assess the effectiveness of tasks affinity estimation techniques.
arXiv Detail & Related papers (2023-01-07T15:16:35Z) - Cross-Task Consistency Learning Framework for Multi-Task Learning [9.991706230252708]
We propose a new learning framework for 2-task MTL problem.
We define two new loss terms inspired by cycle-consistency loss and contrastive learning.
We theoretically prove that both losses help the model learn more efficiently and that cross-task consistency loss is better in terms of alignment with the straight-forward predictions.
arXiv Detail & Related papers (2021-11-28T11:55:19Z) - Task-Feature Collaborative Learning with Application to Personalized
Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL)
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.