Improving Multi-Task Generalization via Regularizing Spurious Correlation
- URL: http://arxiv.org/abs/2205.09797v1
- Date: Thu, 19 May 2022 18:31:54 GMT
- Title: Improving Multi-Task Generalization via Regularizing Spurious Correlation
- Authors: Ziniu Hu and Zhe Zhao and Xinyang Yi and Tiansheng Yao and Lichan Hong
and Yizhou Sun and Ed H. Chi
- Abstract summary: Multi-Task Learning (MTL) is a powerful learning paradigm to improve generalization performance via knowledge sharing.
We propose a framework to represent multi-task knowledge via disentangled neural modules, and learn which module is causally related to each task.
Experiments show that it can improve the MTL model's performance by 5.5% on average over Multi-MNIST, MovieLens, Taskonomy, CityScape, and NYUv2.
- Score: 41.93623986464747
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Multi-Task Learning (MTL) is a powerful learning paradigm to improve
generalization performance via knowledge sharing. However, existing studies
find that MTL could sometimes hurt generalization, especially when two tasks
are less correlated. One possible cause is spurious correlation: some
knowledge is not causally related to the task labels, but the model may
mistakenly exploit it and thus fail when that correlation changes. In the MTL
setup, spurious correlation presents several unique challenges. First, the
risk of encoding non-causal knowledge is higher, since the shared MTL model
must encode knowledge from all tasks, and knowledge that is causal for one
task may be spurious for another.
Second, the confounder between task labels brings in a different type of
spurious correlation into MTL. We theoretically prove that MTL is more prone
than single-task learning to absorbing non-causal knowledge from other tasks,
and thus generalizes worse. To address this problem, we propose the Multi-Task
Causal Representation Learning framework, which represents multi-task
knowledge via disentangled neural modules and learns which module is causally
related to each task via an MTL-specific invariant regularization. Experiments
show that it can improve the MTL model's performance by 5.5% on average over
Multi-MNIST, MovieLens, Taskonomy, CityScape, and NYUv2 by alleviating the
spurious correlation problem.
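To make the framework concrete, here is a minimal sketch in PyTorch. It is an illustration under our own assumptions, not the paper's implementation: the names (`ModularMTL`, `routing_entropy_penalty`) are hypothetical, and the entropy penalty merely stands in for the MTL-specific invariant regularization described above. What it shows is the abstract's core idea: a shared pool of disentangled modules, with learnable per-task routing that decides which modules each task relies on.

```python
import torch
import torch.nn as nn

class ModularMTL(nn.Module):
    """Sketch: a shared pool of disentangled modules with per-task routing."""
    def __init__(self, in_dim, hid_dim, num_modules, num_tasks):
        super().__init__()
        # Knowledge modules shared across all tasks.
        self.module_pool = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())
            for _ in range(num_modules)
        )
        # Routing logits: how strongly each task relies on each module.
        self.routing = nn.Parameter(torch.zeros(num_tasks, num_modules))
        # One prediction head per task.
        self.heads = nn.ModuleList(
            nn.Linear(hid_dim, 1) for _ in range(num_tasks)
        )

    def forward(self, x, task_id):
        feats = torch.stack([m(x) for m in self.module_pool], dim=1)  # (B, M, H)
        w = torch.softmax(self.routing[task_id], dim=0)               # (M,)
        pooled = (w[None, :, None] * feats).sum(dim=1)                # (B, H)
        return self.heads[task_id](pooled)

def routing_entropy_penalty(model):
    # Stand-in regularizer: minimizing routing entropy concentrates each
    # task on a few modules (the paper's invariant regularizer differs).
    p = torch.softmax(model.routing, dim=1)
    return -(p * p.clamp_min(1e-8).log()).sum(dim=1).mean()
```

A training loop would minimize the sum of per-task losses plus a small multiple of this penalty, so that each task learns to rely on the modules that are causally relevant to it rather than on all shared knowledge.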
Related papers
- Can Optimization Trajectories Explain Multi-Task Transfer? [19.797036312370185]
We study how multi-task learning affects generalization in deep learning.
We find that MTL results in a generalization gap (a gap in generalization at comparable training loss) between single-task and multi-task trajectories.
Our work sheds light on the underlying causes of failures in MTL and, importantly, raises questions about the role of general-purpose multi-task optimization algorithms.
arXiv Detail & Related papers (2024-08-26T22:57:01Z)
- A Unified Causal View of Instruction Tuning [76.1000380429553]
We develop a meta Structural Causal Model (meta-SCM) to integrate different NLP tasks under a single causal structure of the data.
The key idea is to learn task-required causal factors and use only those to make predictions for a given task.
arXiv Detail & Related papers (2024-02-09T07:12:56Z)
- Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs [65.42104819071444]
Multitask learning (MTL) leverages task-relatedness to enhance performance.
We employ high-order tensors, with each mode corresponding to a task index, to naturally represent tasks referenced by multiple indices.
We propose a general framework of low-rank MTL methods with tensorized support vector machines (SVMs) and least-squares support vector machines (LSSVMs).
arXiv Detail & Related papers (2023-08-30T14:28:26Z)
- When Multi-Task Learning Meets Partial Supervision: A Computer Vision Review [7.776434991976473]
Multi-Task Learning (MTL) aims to learn multiple tasks simultaneously while exploiting their mutual relationships.
This review focuses on how MTL could be utilised under different partial supervision settings to address these challenges.
arXiv Detail & Related papers (2023-07-25T20:08:41Z)
- When Does Aggregating Multiple Skills with Multi-Task Learning Work? A Case Study in Financial NLP [22.6364117325639]
Multi-task learning (MTL) aims at achieving a better model by leveraging data and knowledge from multiple tasks.
Our findings suggest that the key to MTL success lies in skill diversity, relatedness between tasks, and choice of aggregation size and shared capacity.
arXiv Detail & Related papers (2023-05-23T12:37:14Z)
- M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design [95.41238363769892]
Multi-task learning (MTL) encapsulates multiple learned tasks in a single model and often lets those tasks learn better jointly.
Current MTL regimes have to activate nearly the entire model even to execute just a single task.
We present a model-accelerator co-design framework to enable efficient on-device MTL.
arXiv Detail & Related papers (2022-10-26T15:40:24Z)
- Multi-Task Learning as a Bargaining Game [63.49888996291245]
In multi-task learning (MTL), a joint model is trained to simultaneously make predictions for several tasks.
Since the gradients of these different tasks may conflict, training a joint model for MTL often yields lower performance than the corresponding single-task counterparts.
We propose viewing the gradient combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update (see the toy sketch after this list).
arXiv Detail & Related papers (2022-02-02T13:21:53Z)
- Distribution Matching for Heterogeneous Multi-Task Learning: a Large-scale Face Study [75.42182503265056]
Multi-Task Learning has emerged as a methodology in which multiple tasks are jointly learned by a shared learning algorithm.
We deal with heterogeneous MTL, simultaneously addressing detection, classification & regression problems.
We build FaceBehaviorNet, the first framework for large-scale face analysis, by jointly learning all facial behavior tasks.
arXiv Detail & Related papers (2021-05-08T22:26:52Z)
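To make the bargaining-game view of gradient combination above ("Multi-Task Learning as a Bargaining Game") concrete, here is a toy two-dimensional sketch. It is our own simplification, not the paper's Nash-MTL solver, which uses a dedicated optimization procedure rather than a grid search: among unit-norm update directions, it picks the one maximizing the product of per-task gains g_i · d, which leaves every task better off whenever such a direction exists.

```python
import numpy as np

def bargained_direction(grads, num_angles=3600):
    """Toy 2-D 'negotiation': grid-search the unit circle for the direction
    maximizing the Nash product of per-task gains (a simplification)."""
    g = np.asarray(grads, dtype=float)
    best_d, best_val = None, -np.inf
    for theta in np.linspace(0.0, 2.0 * np.pi, num_angles, endpoint=False):
        d = np.array([np.cos(theta), np.sin(theta)])
        gains = g @ d                      # per-task utility g_i . d
        if np.all(gains > 0):              # only mutually agreeable directions
            val = np.log(gains).sum()      # log of the Nash product
            if val > best_val:
                best_val, best_d = val, d
    return best_d                          # None if no direction helps all tasks

# Two conflicting task gradients: plain averaging leaves task 2 worse off,
# while the bargained direction improves both tasks.
g1, g2 = np.array([1.0, 0.1]), np.array([-0.9, 0.2])
avg = (g1 + g2) / 2
d = bargained_direction([g1, g2])
print("average  :", g1 @ avg, g2 @ avg)   # second value is negative
print("bargained:", g1 @ d, g2 @ d)       # both values are positive
```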