An Information-theoretic Multi-task Representation Learning Framework for Natural Language Understanding
- URL: http://arxiv.org/abs/2503.04667v1
- Date: Thu, 06 Mar 2025 17:59:51 GMT
- Title: An Information-theoretic Multi-task Representation Learning Framework for Natural Language Understanding
- Authors: Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu
- Abstract summary: This paper proposes a new principled multi-task representation learning framework (InfoMTL). It ensures sufficiency of shared representations for all tasks and mitigates the negative effect of redundant features. It can enhance language understanding of pre-trained language models (PLMs) under the multi-task paradigm.
- Score: 29.36409607847339
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a new principled multi-task representation learning framework (InfoMTL) to extract noise-invariant sufficient representations for all tasks. It ensures sufficiency of shared representations for all tasks and mitigates the negative effect of redundant features, which can enhance language understanding of pre-trained language models (PLMs) under the multi-task paradigm. Firstly, a shared information maximization principle is proposed to learn more sufficient shared representations for all target tasks. It can avoid the insufficiency issue arising from representation compression in the multi-task paradigm. Secondly, a task-specific information minimization principle is designed to mitigate the negative effect of potential redundant features in the input for each task. It can compress task-irrelevant redundant information and preserve necessary information relevant to the target for multi-task prediction. Experiments on six classification benchmarks show that our method outperforms 12 comparative multi-task methods under the same multi-task settings, especially in data-constrained and noisy scenarios. Extensive experiments demonstrate that the learned representations are more sufficient, data-efficient, and robust.
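As a concrete illustration of how these two principles can be instantiated, here is a minimal, hypothetical PyTorch sketch (not the authors' released code): the shared information maximization principle is approximated with an InfoNCE-style contrastive lower bound on the mutual information between the input and the shared representation, and the task-specific information minimization principle with a variational-information-bottleneck KL penalty on each task's stochastic representation. All names (InfoMTLSketch, beta, etc.) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InfoMTLSketch(nn.Module):
    """Hypothetical sketch of the two InfoMTL principles; not the authors' code."""

    def __init__(self, encoder_dim=768, latent_dim=128, num_classes=(2, 3)):
        super().__init__()
        self.shared_proj = nn.Linear(encoder_dim, latent_dim)  # shared representation z
        # Per-task stochastic heads (VIB-style): mean and log-variance of q(t|z).
        self.task_mu = nn.ModuleList([nn.Linear(latent_dim, latent_dim) for _ in num_classes])
        self.task_logvar = nn.ModuleList([nn.Linear(latent_dim, latent_dim) for _ in num_classes])
        self.classifiers = nn.ModuleList([nn.Linear(latent_dim, c) for c in num_classes])

    @staticmethod
    def info_nce(z1, z2, temperature=0.1):
        # Contrastive lower bound on I(X; Z): two encodings of the same input
        # (e.g. different dropout masks) are positives; other batch items are negatives.
        z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
        logits = z1 @ z2.t() / temperature
        targets = torch.arange(z1.size(0), device=z1.device)
        return F.cross_entropy(logits, targets)

    def forward(self, h, h_aug, labels, beta=1e-3):
        # h, h_aug: two PLM encodings of the same batch; labels: one tensor per task.
        z, z_aug = self.shared_proj(h), self.shared_proj(h_aug)
        loss = self.info_nce(z, z_aug)  # principle 1: shared information maximization
        for mu, logvar, clf, y in zip(self.task_mu, self.task_logvar, self.classifiers, labels):
            m, lv = mu(z), logvar(z)
            t = m + torch.randn_like(m) * torch.exp(0.5 * lv)  # reparameterization trick
            # principle 2: task-specific information minimization via KL(q(t|z) || N(0, I))
            kl = -0.5 * (1 + lv - m.pow(2) - lv.exp()).sum(dim=-1).mean()
            loss = loss + F.cross_entropy(clf(t), y) + beta * kl
        return loss
```

Here beta sets the compression-prediction trade-off, as in the information bottleneck; the paper's actual estimators, objective weights, and architecture may differ.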
Related papers
- Leveraging knowledge distillation for partial multi-task learning from multiple remote sensing datasets [2.1178416840822023]
Partial multi-task learning, where each training example is annotated for only one of the target tasks, is a promising idea in remote sensing.
This paper proposes using knowledge distillation to replace the need for ground truth on the alternate task and to enhance the performance of such an approach.
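For intuition, a hedged sketch of the general idea (independent of this paper's exact setup): a teacher trained on task B supplies soft targets for examples annotated only for task A, so the student still receives a learning signal on its task-B head. The function name and temperature are illustrative assumptions.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label loss for the task whose ground truth is missing: the student
    matches a teacher's softened distribution instead of hard labels."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_soft_student = F.log_softmax(student_logits / t, dim=-1)
    # t^2 keeps gradient magnitudes comparable across temperatures (Hinton et al., 2015).
    return F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * t * t
```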
arXiv Detail & Related papers (2024-05-24T09:48:50Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all the multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
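A minimal sketch of the instance-level step as described above (the task-ordering search is omitted); the difficulty scoring function is an assumption, e.g. the per-example loss of a warm-up model.

```python
def easy_to_difficult_batches(instances, difficulty, batch_size):
    """Sort one task's instances by an estimated difficulty score and slice
    them into mini-batches, easiest first."""
    ordered = sorted(instances, key=difficulty)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]

# Hypothetical usage: score difficulty with the per-example loss of a warm-up model.
# batches = easy_to_difficult_batches(train_set, lambda ex: warmup_loss(ex), 32)
```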
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks that have little or non-overlapping annotation.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- Factorized Contrastive Learning: Going Beyond Multi-view Redundancy [116.25342513407173]
This paper proposes FactorCL, a new multimodal representation learning method to go beyond multi-view redundancy.
On large-scale real-world datasets, FactorCL captures both shared and unique information and achieves state-of-the-art results.
arXiv Detail & Related papers (2023-06-08T15:17:04Z)
- Leveraging sparse and shared feature activations for disentangled representation learning [112.22699167017471]
We propose to leverage knowledge extracted from a diversified set of supervised tasks to learn a common disentangled representation.
We validate our approach on six real-world distribution shift benchmarks and different data modalities.
arXiv Detail & Related papers (2023-04-17T01:33:24Z)
- Multi-task Bias-Variance Trade-off Through Functional Constraints [102.64082402388192]
Multi-task learning aims to acquire a set of functions that perform well for diverse tasks.
In this paper we draw intuition from the two extreme learning scenarios -- a single function for all tasks, and a task-specific function that ignores the other tasks.
We introduce a constrained learning formulation that enforces domain-specific solutions to stay close to a central function.
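Read literally, that formulation has the shape of the following constrained program (a reconstruction under stated assumptions, with $f_c$ the central function, $f_i$ the per-task solutions, and $\epsilon_i$ tolerances; the paper's exact constraint form may differ):

```latex
\min_{f_c,\, f_1, \dots, f_T} \;
  \sum_{i=1}^{T} \mathbb{E}_{(x,y) \sim \mathcal{D}_i}\!\left[\ell_i\!\left(f_i(x), y\right)\right]
\quad \text{s.t.} \quad
  \| f_i - f_c \|^2 \le \epsilon_i, \qquad i = 1, \dots, T.
```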
arXiv Detail & Related papers (2022-10-27T16:06:47Z)
- Compressed Hierarchical Representations for Multi-Task Learning and Task Clustering [5.878411350387833]
We frame homogeneous-feature multi-task learning as a hierarchical representation learning problem.
We assume an additive independent noise model between the task-agnostic and task-specific latent representations.
We show that the resulting representations yield competitive performance on several MTL benchmarks.
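The additive independent noise assumption above can be written compactly (a reconstruction with assumed symbols: $z_0$ the task-agnostic latent representation, $z_k$ the task-specific one for task $k$, and $\epsilon_k$ noise independent of $z_0$):

```latex
z_k \;=\; z_0 + \epsilon_k, \qquad \epsilon_k \perp z_0, \qquad k = 1, \dots, T.
```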
arXiv Detail & Related papers (2022-05-31T15:31:17Z)
- Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to explore task relatedness to improve individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks.
arXiv Detail & Related papers (2021-11-09T18:49:45Z)
- Context-Aware Multi-Task Learning for Traffic Scene Recognition in Autonomous Vehicles [10.475998113861895]
We propose an algorithm to jointly learn the task-specific and shared representations by adopting a multi-task learning network.
Experiments on the large-scale HSD dataset demonstrate the effectiveness and superiority of our network over state-of-the-art methods.
arXiv Detail & Related papers (2020-04-03T03:09:26Z)