Low Resource Multi-Task Sequence Tagging -- Revisiting Dynamic
Conditional Random Fields
- URL: http://arxiv.org/abs/2005.00250v1
- Date: Fri, 1 May 2020 07:11:34 GMT
- Title: Low Resource Multi-Task Sequence Tagging -- Revisiting Dynamic
Conditional Random Fields
- Authors: Jonas Pfeiffer, Edwin Simpson, Iryna Gurevych
- Abstract summary: We compare different models for low resource multi-task sequence tagging that leverage dependencies between label sequences for different tasks.
We find that explicit modeling of inter-dependencies between task predictions outperforms single-task as well as standard multi-task models.
- Score: 67.51177964010967
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We compare different models for low resource multi-task sequence tagging that
leverage dependencies between label sequences for different tasks. Our analysis
is aimed at datasets where each example has labels for multiple tasks. Current
approaches use either a separate model for each task or standard multi-task
learning to learn shared feature representations. However, these approaches
ignore correlations between label sequences, which can provide important
information in settings with small training datasets. To analyze which
scenarios can profit from modeling dependencies between labels in different
tasks, we revisit dynamic conditional random fields (CRFs) and combine them
with deep neural networks. We compare single-task, multi-task and dynamic CRF
setups for three diverse datasets at both sentence and document levels in
English and German low resource scenarios. We show that including silver labels
from pretrained part-of-speech taggers as auxiliary tasks can improve
performance on downstream tasks. We find that especially in low-resource
scenarios, the explicit modeling of inter-dependencies between task predictions
outperforms single-task as well as standard multi-task models.
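The dynamic-CRF idea in the abstract, coupling the label sequences of different tasks, can be illustrated with a toy joint decoder. The sketch below is hypothetical and not the paper's implementation: label sets, emission scores, transitions, and the inter-task potential are invented, and a real model would produce the emission scores with a neural encoder. Decoding runs a Viterbi search over the product space of the two tasks' labels, so the `inter` potential lets one task's label pull the other task's decoding toward compatible labels at the same position.

```python
import itertools

def joint_viterbi(emissions_a, emissions_b, trans_a, trans_b, inter):
    """Viterbi decoding over the product of two tasks' label spaces.

    emissions_x[t][y] : score of label y for task x at position t
    trans_x[y][y']    : within-task transition score from y to y'
    inter[ya][yb]     : potential coupling the two tasks' labels at the
                        same position -- the term that single-task and
                        standard multi-task models leave out
    """
    n = len(emissions_a)
    labels_a = range(len(emissions_a[0]))
    labels_b = range(len(emissions_b[0]))
    states = list(itertools.product(labels_a, labels_b))

    def node_score(t, ya, yb):
        return emissions_a[t][ya] + emissions_b[t][yb] + inter[ya][yb]

    # dp maps a joint state to (best score ending there, best path).
    dp = {(ya, yb): (node_score(0, ya, yb), [(ya, yb)]) for ya, yb in states}
    for t in range(1, n):
        new_dp = {}
        for ya, yb in states:
            prev_score, prev_path = max(
                (dp[(pa, pb)][0] + trans_a[pa][ya] + trans_b[pb][yb],
                 dp[(pa, pb)][1])
                for pa, pb in states)
            new_dp[(ya, yb)] = (prev_score + node_score(t, ya, yb),
                                prev_path + [(ya, yb)])
        dp = new_dp
    return max(dp.values())

# Toy run: two positions, two labels per task, no transitions and no
# coupling, so each task is decoded on its own evidence.
e_a = [[1.0, 0.0], [0.0, 1.0]]
e_b = [[0.0, 1.0], [1.0, 0.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
score, path = joint_viterbi(e_a, e_b, zeros, zeros, zeros)
```

With a nonzero `inter`, joint states whose two labels are compatible get a bonus, which is exactly the inter-task signal the abstract argues matters in low-resource settings. The product state space grows multiplicatively in the per-task label counts, which is why the technique suits tasks with small label sets such as POS tagging.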
Related papers
- An Efficient General-Purpose Modular Vision Model via Multi-Task
Heterogeneous Training [79.78201886156513]
We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently.
Our approach achieves comparable results to single-task state-of-the-art models and demonstrates strong generalization on downstream tasks.
arXiv Detail & Related papers (2023-06-29T17:59:57Z)
- Identification of Negative Transfers in Multitask Learning Using
Surrogate Models [29.882265735630046]
Multitask learning is widely used to train a low-resource target task by augmenting it with multiple related source tasks.
A critical problem in multitask learning is identifying subsets of source tasks that would benefit the target task.
We introduce an efficient procedure to address this problem via surrogate modeling.
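The surrogate idea above can be sketched minimally, under the hypothetical assumption that target-task performance is roughly additive in the chosen source tasks: evaluate a few task subsets, fit a linear surrogate by least squares, and flag source tasks with a negative fitted weight as candidates for negative transfer. Task names, effect sizes, and the "training" oracle below are all invented for illustration; this is not the paper's procedure in detail.

```python
import itertools
import random

random.seed(0)
SOURCES = ["pos", "ner", "chunking", "parsing"]
# Hidden per-task effects the surrogate should recover ("chunking" hurts).
TRUE_EFFECT = {"pos": 0.05, "ner": 0.02, "chunking": -0.04, "parsing": 0.01}

def train_and_eval(subset):
    # Stand-in for actually training a multitask model on this subset:
    # base score plus each included task's hidden effect plus noise.
    return 0.70 + sum(TRUE_EFFECT[t] for t in subset) + random.gauss(0, 0.002)

# Collect (feature, score) pairs; feature = bias term + subset indicators.
data = []
for subset in itertools.chain.from_iterable(
        itertools.combinations(SOURCES, r) for r in range(len(SOURCES) + 1)):
    x = [1.0] + [1.0 if t in subset else 0.0 for t in SOURCES]
    data.append((x, train_and_eval(subset)))

# Fit the linear surrogate by solving the normal equations (X^T X) w = X^T y
# with plain Gaussian elimination.
d = len(SOURCES) + 1
A = [[sum(x[i] * x[j] for x, _ in data) for j in range(d)] for i in range(d)]
b = [sum(x[i] * y for x, y in data) for i in range(d)]
for i in range(d):                       # forward elimination
    for r in range(i + 1, d):
        f = A[r][i] / A[i][i]
        for c in range(i, d):
            A[r][c] -= f * A[i][c]
        b[r] -= f * b[i]
w = [0.0] * d
for i in reversed(range(d)):             # back substitution
    w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, d))) / A[i][i]

negative_transfer = [t for t, wt in zip(SOURCES, w[1:]) if wt < 0]
```

The payoff is that each candidate subset costs only a surrogate prediction rather than a full training run, so many more subsets can be screened than could ever be trained.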
arXiv Detail & Related papers (2023-03-25T23:16:11Z)
- Relational Multi-Task Learning: Modeling Relations between Data and
Tasks [84.41620970886483]
A key assumption in multi-task learning is that, at inference time, the model has access only to a given data point and not to that data point's labels from other tasks.
Here we introduce a novel relational multi-task learning setting where we leverage data point labels from auxiliary tasks to make more accurate predictions.
We develop MetaLink, where our key innovation is to build a knowledge graph that connects data points and tasks.
arXiv Detail & Related papers (2023-03-14T07:15:41Z)
- Multi-task Active Learning for Pre-trained Transformer-based Models [22.228551277598804]
Multi-task learning, in which several tasks are jointly learned by a single model, allows NLP models to share information from multiple annotations.
This technique requires annotating the same text with multiple annotation schemes which may be costly and laborious.
Active learning (AL) has been demonstrated to optimize annotation processes by iteratively selecting unlabeled examples.
arXiv Detail & Related papers (2022-08-10T14:54:13Z)
- Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners [67.5865966762559]
We study whether sparsely activated Mixture-of-Experts (MoE) improve multi-task learning.
We devise task-aware gating functions to route examples from different tasks to specialized experts.
This results in a sparsely activated multi-task model with a large number of parameters, but with the same computational cost as that of a dense model.
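The task-aware routing described above can be sketched in plain Python. This is a hypothetical illustration rather than the paper's model: the gating weights are hand-set per task, and each "expert" is reduced to a per-dimension scaling standing in for a feed-forward block.

```python
import math
import random

random.seed(0)
N_EXPERTS, TOP_K, DIM = 4, 1, 3

# Each "expert" is a per-dimension scale (a stand-in for a real FFN block).
experts = [[random.uniform(0.5, 1.5) for _ in range(DIM)]
           for _ in range(N_EXPERTS)]

# Task-aware gate: each task carries its own gating logits over experts,
# so examples from different tasks are routed to specialised experts.
task_gates = {
    "pos_tagging": [2.0, 0.1, 0.1, 0.1],
    "ner":         [0.1, 0.1, 2.0, 0.1],
}

def moe_forward(x, task):
    logits = task_gates[task]
    exps = [math.exp(l) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]               # softmax over experts
    top = sorted(range(N_EXPERTS), key=lambda i: -probs[i])[:TOP_K]
    # Only the selected experts run, so compute cost matches a dense
    # single-expert model despite the larger total parameter count.
    out = [0.0] * DIM
    for i in top:
        for d in range(DIM):
            out[d] += probs[i] * experts[i][d] * x[d]
    return out, top

out, chosen = moe_forward([1.0, 1.0, 1.0], "pos_tagging")
```

In a trained model the gating logits would be a learned function of both the input representation and a task embedding; the fixed per-task logits here just make the routing behaviour easy to see.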
arXiv Detail & Related papers (2022-04-16T00:56:12Z)
- Improving Multi-task Generalization Ability for Neural Text Matching via
Prompt Learning [54.66399120084227]
Recent state-of-the-art neural text matching models based on pre-trained language models (PLMs) struggle to generalize to different tasks.
We adopt a specialization-generalization training strategy and refer to it as Match-Prompt.
In the specialization stage, descriptions of different matching tasks are mapped to only a few prompt tokens.
In the generalization stage, the text matching model learns the essential matching signals by being trained on diverse matching tasks.
arXiv Detail & Related papers (2022-04-06T11:01:08Z)
- Rethinking Hard-Parameter Sharing in Multi-Task Learning [20.792654758645302]
Hard parameter sharing in multi-task learning (MTL) allows tasks to share some of the model parameters, reducing storage cost and improving prediction accuracy.
The common sharing practice is to share bottom layers of a deep neural network among tasks while using separate top layers for each task.
Using separate bottom-layer parameters could achieve significantly better performance than the common practice.
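The two layouts being contrasted can be made concrete with a small parameter-accounting sketch in plain Python. Dimensions and task names are invented for illustration, and the random linear layers stand in for real network blocks; the point is only the structural difference between a shared bottom and separate bottoms.

```python
import random

random.seed(0)
DIM_IN, DIM_HID, DIM_OUT = 16, 32, 4
TASKS = ["task_a", "task_b", "task_c"]

def linear(n_in, n_out):
    return [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]

def n_params(layer):
    return sum(len(row) for row in layer)

# Common practice: one shared bottom layer, one small head per task.
shared_bottom = linear(DIM_IN, DIM_HID)
shared_heads = {t: linear(DIM_HID, DIM_OUT) for t in TASKS}
shared_total = n_params(shared_bottom) + sum(
    n_params(h) for h in shared_heads.values())

# Alternative from the paper: a separate bottom layer per task, trading
# extra storage for potentially better accuracy.
sep_bottoms = {t: linear(DIM_IN, DIM_HID) for t in TASKS}
sep_heads = {t: linear(DIM_HID, DIM_OUT) for t in TASKS}
sep_total = sum(n_params(sep_bottoms[t]) + n_params(sep_heads[t])
                for t in TASKS)

def forward(x, task, bottoms, heads):
    def apply(layer, v):
        return [sum(w * vi for w, vi in zip(row, v)) for row in layer]
    bottom = bottoms if isinstance(bottoms, list) else bottoms[task]
    return apply(heads[task], apply(bottom, x))
```

With these toy sizes the separate-bottom variant stores roughly twice the parameters of the shared-bottom one, which is the storage-versus-accuracy trade-off the summary describes.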
arXiv Detail & Related papers (2021-07-23T17:26:40Z)
- Exploring Relational Context for Multi-Task Dense Prediction [76.86090370115]
We consider a multi-task environment for dense prediction tasks, represented by a common backbone and independent task-specific heads.
We explore various attention-based contexts, such as global and local, in the multi-task setting.
We propose an Adaptive Task-Relational Context module, which samples the pool of all available contexts for each task pair.
arXiv Detail & Related papers (2021-04-28T16:45:56Z)
- Modelling Latent Skills for Multitask Language Generation [15.126163032403811]
We present a generative model for multitask conditional language generation.
Our guiding hypothesis is that a shared set of latent skills underlies many disparate language generation tasks.
We instantiate this task embedding space as a latent variable in a latent variable sequence-to-sequence model.
arXiv Detail & Related papers (2020-02-21T20:39:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and accepts no responsibility for any consequences of its use.