Multi-task Active Learning for Pre-trained Transformer-based Models
- URL: http://arxiv.org/abs/2208.05379v1
- Date: Wed, 10 Aug 2022 14:54:13 GMT
- Title: Multi-task Active Learning for Pre-trained Transformer-based Models
- Authors: Guy Rotman and Roi Reichart
- Abstract summary: Multi-task learning, in which several tasks are jointly learned by a single model, allows NLP models to share information from multiple annotations.
This technique requires annotating the same text with multiple annotation schemes, which may be costly and laborious.
Active learning (AL) has been demonstrated to optimize annotation processes by iteratively selecting unlabeled examples.
- Score: 22.228551277598804
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-task learning, in which several tasks are jointly learned by a single
model, allows NLP models to share information from multiple annotations and may
facilitate better predictions when the tasks are inter-related. This technique,
however, requires annotating the same text with multiple annotation schemes,
which may be costly and laborious. Active learning (AL) has been demonstrated
to optimize annotation processes by iteratively selecting unlabeled examples
whose annotation is most valuable for the NLP model. Yet, multi-task active
learning (MT-AL) has not been applied to state-of-the-art pre-trained
Transformer-based NLP models. This paper aims to close this gap. We explore
various multi-task selection criteria in three realistic multi-task scenarios,
reflecting different relations between the participating tasks, and demonstrate
the effectiveness of multi-task compared to single-task selection. Our results
suggest that MT-AL can be effectively used to minimize annotation effort for
multi-task NLP models.
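To make the selection loop concrete, the sketch below shows one round-based MT-AL procedure that scores unlabeled examples by predictive entropy and averages the scores across tasks. The toy pool, the entropy criterion, and the mean aggregation are illustrative assumptions, not the specific selection criteria evaluated in the paper.
```python
# A minimal sketch of a multi-task active learning (MT-AL) loop, assuming an
# uncertainty-based criterion averaged over tasks. The toy pool, the entropy
# score, and the mean aggregation are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)

def entropy(probs: np.ndarray) -> np.ndarray:
    """Predictive entropy per example for one task (higher = more uncertain)."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=-1)

def multi_task_scores(per_task_probs: list[np.ndarray]) -> np.ndarray:
    """One possible multi-task criterion: mean per-task uncertainty."""
    return np.mean([entropy(p) for p in per_task_probs], axis=0)

# Toy unlabeled pool: 100 examples, two tasks with 5 and 3 classes.
pool = np.arange(100)
labeled: list[int] = []
budget_per_round, rounds = 10, 3

for al_round in range(rounds):
    # In a real setup, a multi-task Transformer would be (re)trained on the
    # labeled set here; we fake its pool predictions with random simplex points.
    probs_a = rng.dirichlet(np.ones(5), size=pool.size)
    probs_b = rng.dirichlet(np.ones(3), size=pool.size)
    scores = multi_task_scores([probs_a, probs_b])
    top = np.argsort(scores)[::-1][:budget_per_round]  # most uncertain examples
    labeled.extend(pool[top].tolist())                 # send them for annotation
    pool = np.delete(pool, top)                        # remove from the pool
    print(f"round {al_round}: labeled so far {len(labeled)}, pool left {pool.size}")
```
In practice, the pool predictions would come from a pre-trained multi-task Transformer fine-tuned on the examples labeled so far, rather than the random probabilities used here.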
Related papers
- Making Small Language Models Better Multi-task Learners with Mixture-of-Task-Adapters [13.6682552098234]
Large Language Models (LLMs) have achieved amazing zero-shot learning performance over a variety of Natural Language Processing (NLP) tasks.
We present ALTER, a system that effectively builds multi-tAsk Learners with mixTure-of-task-adaptERs upon small language models.
A two-stage training method is proposed to optimize the collaboration between adapters at a small computational cost.
arXiv Detail & Related papers (2023-09-20T03:39:56Z)
- Task Selection and Assignment for Multi-modal Multi-task Dialogue Act Classification with Non-stationary Multi-armed Bandits [11.682678945754837]
Multi-task learning (MTL) aims to improve the performance of a primary task by jointly learning with related auxiliary tasks.
Previous studies suggest that randomly selecting auxiliary tasks may not be helpful and can even harm performance.
This paper proposes a method for selecting and assigning tasks based on non-stationary multi-armed bandits; an illustrative bandit sketch appears after this list.
arXiv Detail & Related papers (2023-09-18T14:51:51Z)
- Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z)
- Identification of Negative Transfers in Multitask Learning Using Surrogate Models [29.882265735630046]
Multitask learning is widely used to train a low-resource target task by augmenting it with multiple related source tasks.
A critical problem in multitask learning is identifying subsets of source tasks that would benefit the target task.
We introduce an efficient procedure to address this problem via surrogate modeling.
arXiv Detail & Related papers (2023-03-25T23:16:11Z)
- OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models [72.8156832931841]
Generalist models are capable of performing diverse multi-modal tasks in a task-agnostic way within a single model.
We release a generalist model learning system, OFASys, built on top of a declarative task interface named multi-modal instruction.
arXiv Detail & Related papers (2022-12-08T17:07:09Z)
- A Survey of Multi-task Learning in Natural Language Processing: Regarding Task Relatedness and Training Methods [17.094426577723507]
Multi-task learning (MTL) has become increasingly popular in natural language processing (NLP).
It improves the performance of related tasks by exploiting their commonalities and differences.
It is still not well understood how multi-task learning should be implemented based on the relatedness of training tasks.
arXiv Detail & Related papers (2022-04-07T15:22:19Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- The Effect of Diversity in Meta-Learning [79.56118674435844]
Few-shot learning aims to learn representations that can tackle novel tasks given a small number of examples.
Recent studies show that task distribution plays a vital role in the model's performance.
We study different task distributions on a myriad of models and datasets to evaluate the effect of task diversity on meta-learning algorithms.
arXiv Detail & Related papers (2022-01-27T19:39:07Z)
- Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to explore task relatedness to improve individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks.
arXiv Detail & Related papers (2021-11-09T18:49:45Z)
- Low Resource Multi-Task Sequence Tagging -- Revisiting Dynamic Conditional Random Fields [67.51177964010967]
We compare different models for low resource multi-task sequence tagging that leverage dependencies between label sequences for different tasks.
We find that explicit modeling of inter-dependencies between task predictions outperforms single-task as well as standard multi-task models.
arXiv Detail & Related papers (2020-05-01T07:11:34Z)
- TempLe: Learning Template of Transitions for Sample Efficient Multi-task RL [18.242904106537654]
TempLe is the first PAC-MDP method for multi-task reinforcement learning.
We present two algorithms, for an "online" and a "finite-model" setting, respectively.
We prove that our proposed TempLe algorithms achieve much lower sample complexity than single-task learners or state-of-the-art multi-task methods.
arXiv Detail & Related papers (2020-02-16T19:46:49Z)
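The task-selection entry above proposes non-stationary multi-armed bandits for choosing auxiliary tasks; the sketch below illustrates the general idea with a sliding-window UCB chooser over candidate tasks. The task names, window size, and simulated reward are assumptions for illustration and do not reproduce that paper's method.
```python
# Illustrative sketch (not the cited paper's method) of choosing an auxiliary
# task with a non-stationary multi-armed bandit: sliding-window UCB, where each
# arm is a candidate task and the reward is a (simulated) gain on the primary task.
import math
import random
from collections import deque

random.seed(0)

TASKS = ["sentiment", "dialogue_act", "emotion"]  # hypothetical auxiliary tasks
WINDOW = 20                                       # only recent rewards count
history = {t: deque(maxlen=WINDOW) for t in TASKS}

def ucb_score(task: str, step: int) -> float:
    rewards = history[task]
    if not rewards:                      # force one initial pull per arm
        return float("inf")
    mean = sum(rewards) / len(rewards)
    bonus = math.sqrt(2.0 * math.log(step + 1) / len(rewards))
    return mean + bonus

def simulated_reward(task: str, step: int) -> float:
    # Stand-in for the real signal (e.g., primary-task validation improvement);
    # "emotion" becomes more useful later on to mimic non-stationarity.
    base = {"sentiment": 0.5, "dialogue_act": 0.4, "emotion": 0.2}[task]
    drift = 0.4 if task == "emotion" and step > 50 else 0.0
    return max(0.0, min(1.0, random.gauss(base + drift, 0.1)))

for step in range(100):
    chosen = max(TASKS, key=lambda t: ucb_score(t, step))
    history[chosen].append(simulated_reward(chosen, step))

# Recent mean reward per task; the bandit shifts toward "emotion" after the drift.
print({t: round(sum(h) / max(len(h), 1), 3) for t, h in history.items()})
```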
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.