When Does Aggregating Multiple Skills with Multi-Task Learning Work? A
Case Study in Financial NLP
- URL: http://arxiv.org/abs/2305.14007v1
- Date: Tue, 23 May 2023 12:37:14 GMT
- Title: When Does Aggregating Multiple Skills with Multi-Task Learning Work? A
Case Study in Financial NLP
- Authors: Jingwei Ni, Zhijing Jin, Qian Wang, Mrinmaya Sachan, Markus Leippold
- Abstract summary: Multi-task learning (MTL) aims at achieving a better model by leveraging data and knowledge from multiple tasks.
Our findings suggest that the key to MTL success lies in skill diversity, relatedness between tasks, and choice of aggregation size and shared capacity.
- Score: 22.6364117325639
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-task learning (MTL) aims at achieving a better model by leveraging data
and knowledge from multiple tasks. However, MTL does not always work --
sometimes negative transfer occurs between tasks, especially when aggregating
loosely related skills, leaving it an open question when MTL works. Previous
studies show that MTL performance can be improved by algorithmic tricks.
However, what tasks and skills should be included is less well explored. In
this work, we conduct a case study in Financial NLP where multiple datasets
exist for skills relevant to the domain, such as numeric reasoning and
sentiment analysis. Due to the task difficulty and data scarcity in the
Financial NLP domain, we explore when aggregating such diverse skills from
multiple datasets with MTL can work. Our findings suggest that the key to MTL
success lies in skill diversity, relatedness between tasks, and choice of
aggregation size and shared capacity. Specifically, MTL works well when tasks
are diverse but related, and when the size of the task aggregation and the
shared capacity of the model are balanced to avoid overwhelming certain tasks.
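To make the finding concrete, here is a minimal sketch of the shared-encoder, multi-head setup the abstract refers to: the number of task heads is the aggregation size and the encoder is the shared capacity. The task names, model sizes, and toy data below are illustrative assumptions, not the paper's actual configuration.

```python
# A minimal multi-task setup: one shared encoder plus one lightweight head per
# aggregated task/skill. Task names, sizes, and the toy data are assumptions.
import torch
import torch.nn as nn

class SharedEncoderMTL(nn.Module):
    def __init__(self, vocab_size, hidden_dim, task_num_labels):
        super().__init__()
        # "Shared capacity": the encoder parameters used by every task.
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.encoder = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        # One classification head per aggregated task (the "aggregation size").
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden_dim, n_labels)
            for task, n_labels in task_num_labels.items()
        })

    def forward(self, token_ids, task):
        x = self.embed(token_ids)
        _, h = self.encoder(x)          # h: (1, batch, hidden_dim)
        return self.heads[task](h[-1])  # logits for the requested task

# Hypothetical skills aggregated in one model.
tasks = {"sentiment": 3, "numeric_reasoning": 2}
model = SharedEncoderMTL(vocab_size=1000, hidden_dim=64, task_num_labels=tasks)
loss_fn = nn.CrossEntropyLoss()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)

# One joint step over a toy batch per task; the summed loss means every task
# updates the shared encoder.
optim.zero_grad()
total_loss = 0.0
for task, n_labels in tasks.items():
    token_ids = torch.randint(0, 1000, (8, 16))   # toy batch of token ids
    labels = torch.randint(0, n_labels, (8,))
    total_loss = total_loss + loss_fn(model(token_ids, task), labels)
total_loss.backward()
optim.step()
```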
Related papers
- Distribution Matching for Multi-Task Learning of Classification Tasks: a
Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks whose annotations overlap little or not at all.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
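As a rough illustration of distribution matching (not necessarily this paper's exact objective), one common formulation aligns a multi-task head's softened predictions with a single-task teacher's predictions on examples that lack labels for that task:

```python
# Illustrative only: knowledge exchange via matching output distributions when
# annotations do not overlap. The teacher/student setup and the KL formulation
# are assumptions, not necessarily the paper's objective.
import torch
import torch.nn.functional as F

def distribution_matching_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened class distributions."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * t * t

# Toy usage: logits for a 5-class task on a batch of 8 unlabeled-for-this-task examples.
student_logits = torch.randn(8, 5, requires_grad=True)
teacher_logits = torch.randn(8, 5)
loss = distribution_matching_loss(student_logits, teacher_logits)
loss.backward()
```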
arXiv Detail & Related papers (2024-01-02T14:18:11Z) - Task Grouping for Automated Multi-Task Machine Learning via Task
Affinity Prediction [7.975047833725489]
Multi-task learning (MTL) models can attain significantly higher accuracy than single-task learning (STL) models.
In this paper, we propose a novel automated approach for task grouping.
We identify inherent task features and STL characteristics that can help us to predict whether a group of tasks should be learned together using MTL or if they should be learned independently using STL.
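A minimal sketch of that idea, with an entirely hypothetical feature set and decision threshold, might look like the following (the predictor would of course need to be trained on observed MTL-vs-STL outcomes):

```python
# Sketch: a learned predictor maps features of a candidate task group
# (e.g., STL scores, dataset sizes, a similarity statistic) to a
# group-with-MTL vs. train-separately decision. Features are hypothetical.
import torch
import torch.nn as nn

affinity_predictor = nn.Sequential(
    nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 1)
)

def should_group(stl_scores, dataset_sizes, mean_task_similarity):
    feats = torch.tensor([
        sum(stl_scores) / len(stl_scores),
        sum(dataset_sizes) / len(dataset_sizes) / 1e5,  # crude scale normalization
        mean_task_similarity,
    ], dtype=torch.float32)
    logit = affinity_predictor(feats)
    return torch.sigmoid(logit).item() > 0.5  # True -> learn the group jointly with MTL

print(should_group(stl_scores=[0.81, 0.74], dataset_sizes=[12000, 8000],
                   mean_task_similarity=0.6))
```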
arXiv Detail & Related papers (2023-10-24T23:29:46Z) - "It's a Match!" -- A Benchmark of Task Affinity Scores for Joint
Learning [74.14961250042629]
While the promises of Multi-Task Learning (MTL) are attractive, characterizing the conditions of its success is still an open problem in Deep Learning.
Estimating task affinity for joint learning is a key endeavor.
Recent work suggests that the training conditions themselves have a significant impact on the outcomes of MTL.
Yet, the literature lacks a benchmark to assess the effectiveness of task affinity estimation techniques.
arXiv Detail & Related papers (2023-01-07T15:16:35Z) - Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners [74.92558307689265]
We propose Mod-Squad, a new model that is modularized into groups of experts (a "Squad").
We optimize this matching process during the training of a single model.
Experiments on the Taskonomy dataset with 13 vision tasks and the PASCAL-Context dataset with 5 vision tasks show the superiority of our approach.
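As a simplified stand-in for this expert-task matching (not Mod-Squad's actual objective), one can give each task a learnable soft assignment over experts that is optimized together with the rest of the model:

```python
# Illustrative sketch: a layer "modularized" into experts, with a trainable
# soft task-to-expert assignment. A simplified stand-in, not Mod-Squad itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskExpertLayer(nn.Module):
    def __init__(self, dim, num_experts, num_tasks):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(num_experts)]
        )
        # Trainable task-to-expert matching, softmaxed into mixture weights.
        self.task_expert_logits = nn.Parameter(torch.zeros(num_tasks, num_experts))

    def forward(self, x, task_id):
        weights = F.softmax(self.task_expert_logits[task_id], dim=-1)   # (num_experts,)
        expert_outs = torch.stack([e(x) for e in self.experts], dim=0)  # (E, batch, dim)
        return torch.einsum("e,ebd->bd", weights, expert_outs)

layer = TaskExpertLayer(dim=32, num_experts=4, num_tasks=3)
out = layer(torch.randn(8, 32), task_id=1)   # (8, 32)
```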
arXiv Detail & Related papers (2022-12-15T18:59:52Z) - Improving Multi-Task Generalization via Regularizing Spurious
Correlation [41.93623986464747]
Multi-Task Learning (MTL) is a powerful learning paradigm to improve generalization performance via knowledge sharing.
We propose a framework to represent multi-task knowledge via disentangled neural modules, and learn which module is causally related to each task.
Experiments show that it can improve an MTL model's performance by 5.5% on average over Multi-MNIST, MovieLens, Taskonomy, CityScape, and NYUv2.
arXiv Detail & Related papers (2022-05-19T18:31:54Z) - When to Use Multi-Task Learning vs Intermediate Fine-Tuning for
Pre-Trained Encoder Transfer Learning [15.39115079099451]
Transfer learning (TL) in natural language processing has seen a surge of interest in recent years.
Three main strategies have emerged for making use of multiple supervised datasets during fine-tuning.
We compare all three TL methods in a comprehensive analysis on the GLUE dataset suite.
arXiv Detail & Related papers (2022-05-17T06:48:45Z) - Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners [67.5865966762559]
We study whether sparsely activated Mixture-of-Experts (MoE) improve multi-task learning.
We devise task-aware gating functions to route examples from different tasks to specialized experts.
This results in a sparsely activated multi-task model with a large number of parameters, but with the same computational cost as that of a dense model.
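A sketch of task-aware gating with top-1 (sparse) routing follows: the gate conditions on both the example representation and a task embedding, so only one expert runs per example. The sizes and the top-1 choice are illustrative assumptions.

```python
# Task-aware sparse MoE sketch: route each example to its top-1 expert based on
# the example representation plus a task embedding. Sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskAwareMoE(nn.Module):
    def __init__(self, dim, num_experts, num_tasks):
        super().__init__()
        self.task_embed = nn.Embedding(num_tasks, dim)
        self.gate = nn.Linear(2 * dim, num_experts)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(num_experts)]
        )

    def forward(self, x, task_ids):
        # x: (batch, dim); task_ids: (batch,)
        gate_in = torch.cat([x, self.task_embed(task_ids)], dim=-1)
        scores = F.softmax(self.gate(gate_in), dim=-1)    # (batch, num_experts)
        top_score, top_idx = scores.max(dim=-1)           # top-1 routing
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():                                # run only the chosen expert
                out[mask] = expert(x[mask]) * top_score[mask].unsqueeze(-1)
        return out

moe = TaskAwareMoE(dim=32, num_experts=4, num_tasks=3)
y = moe(torch.randn(8, 32), torch.randint(0, 3, (8,)))
```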
arXiv Detail & Related papers (2022-04-16T00:56:12Z) - Multi-Task Learning as a Bargaining Game [63.49888996291245]
In Multi-task learning (MTL), a joint model is trained to simultaneously make predictions for several tasks.
Since the gradients of these different tasks may conflict, training a joint model for MTL often yields lower performance than its corresponding single-task counterparts.
We propose viewing the gradient combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update.
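The underlying problem is that per-task gradients can point in conflicting directions. The snippet below shows a simple conflict-aware combination (projecting away the conflicting component, in the spirit of gradient surgery) purely to illustrate the issue; it is not the paper's bargaining-game solution.

```python
# Conflicting task gradients: if grad_a . grad_b < 0, naive summation lets one
# task dominate. This conflict-aware combination is only an illustration of the
# problem, not the Nash-bargaining update proposed in the paper.
import torch

def combine_gradients(grad_a, grad_b):
    dot = torch.dot(grad_a, grad_b)
    if dot < 0:  # tasks conflict: remove the component of grad_a along grad_b
        grad_a = grad_a - dot / grad_b.norm() ** 2 * grad_b
    return grad_a + grad_b

g_task_a = torch.tensor([1.0, 2.0, -1.0])
g_task_b = torch.tensor([-1.0, 0.5, 2.0])
print(combine_gradients(g_task_a, g_task_b))
```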
arXiv Detail & Related papers (2022-02-02T13:21:53Z) - Distribution Matching for Heterogeneous Multi-Task Learning: a
Large-scale Face Study [75.42182503265056]
Multi-Task Learning has emerged as a methodology in which multiple tasks are jointly learned by a shared learning algorithm.
We deal with heterogeneous MTL, simultaneously addressing detection, classification & regression problems.
We build FaceBehaviorNet, the first framework for large-scale face analysis, by jointly learning all facial behavior tasks.
arXiv Detail & Related papers (2021-05-08T22:26:52Z) - Task Uncertainty Loss Reduce Negative Transfer in Asymmetric Multi-task
Feature Learning [0.0]
Multi-task learning (MTL) can improve task performance overall relative to single-task learning (STL), but can hide negative transfer (NT).
Asymmetric multitask feature learning (AMTFL) is an approach that tries to address this by allowing tasks with higher loss values to have smaller influence on feature representations for learning other tasks.
We present examples of NT in two datasets (image recognition and pharmacogenomics) and tackle this challenge by using aleatoric homoscedastic uncertainty to capture the relative confidence between tasks and to set the weights of the task losses.
arXiv Detail & Related papers (2020-12-17T13:30:45Z)
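The uncertainty-based loss weighting described in the last entry can be sketched as a learnable log-variance per task that scales that task's loss, in the style of homoscedastic uncertainty weighting; the task names and loss values below are toy assumptions.

```python
# Sketch of task-level (homoscedastic) uncertainty weighting: each task gets a
# learnable log-variance s, its loss is scaled by exp(-s), and s is added as a
# regularizer, so less confident tasks are automatically down-weighted.
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    def __init__(self, task_names):
        super().__init__()
        self.log_vars = nn.ParameterDict(
            {t: nn.Parameter(torch.zeros(())) for t in task_names}
        )

    def forward(self, task_losses):
        total = 0.0
        for task, loss in task_losses.items():
            s = self.log_vars[task]
            total = total + torch.exp(-s) * loss + s
        return total

weigher = UncertaintyWeightedLoss(["image_recognition", "pharmacogenomics"])
toy_losses = {"image_recognition": torch.tensor(0.9),
              "pharmacogenomics": torch.tensor(2.3)}
print(weigher(toy_losses))
```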