Low Resource Multi-Task Sequence Tagging -- Revisiting Dynamic
Conditional Random Fields
- URL: http://arxiv.org/abs/2005.00250v1
- Date: Fri, 1 May 2020 07:11:34 GMT
- Title: Low Resource Multi-Task Sequence Tagging -- Revisiting Dynamic
Conditional Random Fields
- Authors: Jonas Pfeiffer, Edwin Simpson, Iryna Gurevych
- Abstract summary: We compare different models for low resource multi-task sequence tagging that leverage dependencies between label sequences for different tasks.
We find that explicit modeling of inter-dependencies between task predictions outperforms single-task as well as standard multi-task models.
- Score: 67.51177964010967
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We compare different models for low resource multi-task sequence tagging that
leverage dependencies between label sequences for different tasks. Our analysis
is aimed at datasets where each example has labels for multiple tasks. Current
approaches use either a separate model for each task or standard multi-task
learning to learn shared feature representations. However, these approaches
ignore correlations between label sequences, which can provide important
information in settings with small training datasets. To analyze which
scenarios can profit from modeling dependencies between labels in different
tasks, we revisit dynamic conditional random fields (CRFs) and combine them
with deep neural networks. We compare single-task, multi-task and dynamic CRF
setups for three diverse datasets at both sentence and document levels in
English and German low resource scenarios. We show that including silver labels
from pretrained part-of-speech taggers as auxiliary tasks can improve
performance on downstream tasks. We find that especially in low-resource
scenarios, the explicit modeling of inter-dependencies between task predictions
outperforms single-task as well as standard multi-task models.
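The dynamic-CRF idea in the abstract, coupling the label sequences of different tasks, can be illustrated with a toy joint decoder. The sketch below is hypothetical and not the paper's implementation: label sets, emission scores, transitions, and the inter-task potential are invented, and a real model would produce the emission scores with a neural encoder. Decoding runs a Viterbi search over the product space of the two tasks' labels, so the `inter` potential lets one task's label pull the other task's decoding toward compatible labels at the same position.

```python
import itertools

def joint_viterbi(emissions_a, emissions_b, trans_a, trans_b, inter):
    """Viterbi decoding over the product of two tasks' label spaces.

    emissions_x[t][y] : score of label y for task x at position t
    trans_x[y][y']    : within-task transition score from y to y'
    inter[ya][yb]     : potential coupling the two tasks' labels at the
                        same position -- the term that single-task and
                        standard multi-task models leave out
    """
    n = len(emissions_a)
    labels_a = range(len(emissions_a[0]))
    labels_b = range(len(emissions_b[0]))
    states = list(itertools.product(labels_a, labels_b))

    def node_score(t, ya, yb):
        return emissions_a[t][ya] + emissions_b[t][yb] + inter[ya][yb]

    # dp maps a joint state to (best score ending there, best path).
    dp = {(ya, yb): (node_score(0, ya, yb), [(ya, yb)]) for ya, yb in states}
    for t in range(1, n):
        new_dp = {}
        for ya, yb in states:
            prev_score, prev_path = max(
                (dp[(pa, pb)][0] + trans_a[pa][ya] + trans_b[pb][yb],
                 dp[(pa, pb)][1])
                for pa, pb in states)
            new_dp[(ya, yb)] = (prev_score + node_score(t, ya, yb),
                                prev_path + [(ya, yb)])
        dp = new_dp
    return max(dp.values())

# Toy run: two positions, two labels per task, no transitions and no
# coupling, so each task is decoded on its own evidence.
e_a = [[1.0, 0.0], [0.0, 1.0]]
e_b = [[0.0, 1.0], [1.0, 0.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
score, path = joint_viterbi(e_a, e_b, zeros, zeros, zeros)
```

With a nonzero `inter`, joint states whose two labels are compatible get a bonus, which is exactly the inter-task signal the abstract argues matters in low-resource settings. The product state space grows multiplicatively in the per-task label counts, which is why the technique suits tasks with small label sets such as POS tagging.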
Related papers
- An Efficient General-Purpose Modular Vision Model via Multi-Task
Heterogeneous Training [79.78201886156513]
We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently.
Our approach achieves comparable results to single-task state-of-the-art models and demonstrates strong generalization on downstream tasks.
arXiv Detail & Related papers (2023-06-29T17:59:57Z)
- Identification of Negative Transfers in Multitask Learning Using
Surrogate Models [29.882265735630046]
Multitask learning is widely used to train a low-resource target task by augmenting it with multiple related source tasks.
A critical problem in multitask learning is identifying subsets of source tasks that would benefit the target task.
We introduce an efficient procedure to address this problem via surrogate modeling.
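The surrogate idea above can be sketched minimally, under the hypothetical assumption that target-task performance is roughly additive in the chosen source tasks: evaluate a few task subsets, fit a linear surrogate by least squares, and flag source tasks with a negative fitted weight as candidates for negative transfer. Task names, effect sizes, and the "training" oracle below are all invented for illustration; this is not the paper's procedure in detail.

```python
import itertools
import random

random.seed(0)
SOURCES = ["pos", "ner", "chunking", "parsing"]
# Hidden per-task effects the surrogate should recover ("chunking" hurts).
TRUE_EFFECT = {"pos": 0.05, "ner": 0.02, "chunking": -0.04, "parsing": 0.01}

def train_and_eval(subset):
    # Stand-in for actually training a multitask model on this subset:
    # base score plus each included task's hidden effect plus noise.
    return 0.70 + sum(TRUE_EFFECT[t] for t in subset) + random.gauss(0, 0.002)

# Collect (feature, score) pairs; feature = bias term + subset indicators.
data = []
for subset in itertools.chain.from_iterable(
        itertools.combinations(SOURCES, r) for r in range(len(SOURCES) + 1)):
    x = [1.0] + [1.0 if t in subset else 0.0 for t in SOURCES]
    data.append((x, train_and_eval(subset)))

# Fit the linear surrogate by solving the normal equations (X^T X) w = X^T y
# with plain Gaussian elimination.
d = len(SOURCES) + 1
A = [[sum(x[i] * x[j] for x, _ in data) for j in range(d)] for i in range(d)]
b = [sum(x[i] * y for x, y in data) for i in range(d)]
for i in range(d):                       # forward elimination
    for r in range(i + 1, d):
        f = A[r][i] / A[i][i]
        for c in range(i, d):
            A[r][c] -= f * A[i][c]
        b[r] -= f * b[i]
w = [0.0] * d
for i in reversed(range(d)):             # back substitution
    w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, d))) / A[i][i]

negative_transfer = [t for t, wt in zip(SOURCES, w[1:]) if wt < 0]
```

The payoff is that each candidate subset costs only a surrogate prediction rather than a full training run, so many more subsets can be screened than could ever be trained.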
arXiv Detail & Related papers (2023-03-25T23:16:11Z)
- Relational Multi-Task Learning: Modeling Relations between Data and
Tasks [84.41620970886483]
A key assumption in multi-task learning is that, at inference time, the model has access only to a given data point and not to that data point's labels from other tasks.
Here we introduce a novel relational multi-task learning setting where we leverage data point labels from auxiliary tasks to make more accurate predictions.
We develop MetaLink, where our key innovation is to build a knowledge graph that connects data points and tasks.
arXiv Detail & Related papers (2023-03-14T07:15:41Z)
- Multi-task Active Learning for Pre-trained Transformer-based Models [22.228551277598804]
Multi-task learning, in which several tasks are jointly learned by a single model, allows NLP models to share information from multiple annotations.
This technique requires annotating the same text with multiple annotation schemes which may be costly and laborious.
Active learning (AL) has been demonstrated to optimize annotation processes by iteratively selecting unlabeled examples.
arXiv Detail & Related papers (2022-08-10T14:54:13Z)
- Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners [67.5865966762559]
We study whether sparsely activated Mixture-of-Experts (MoE) improve multi-task learning.
We devise task-aware gating functions to route examples from different tasks to specialized experts.
This results in a sparsely activated multi-task model with a large number of parameters, but with the same computational cost as that of a dense model.
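The task-aware routing described above can be sketched in plain Python. This is a hypothetical illustration rather than the paper's model: the gating weights are hand-set per task, and each "expert" is reduced to a per-dimension scaling standing in for a feed-forward block.

```python
import math
import random

random.seed(0)
N_EXPERTS, TOP_K, DIM = 4, 1, 3

# Each "expert" is a per-dimension scale (a stand-in for a real FFN block).
experts = [[random.uniform(0.5, 1.5) for _ in range(DIM)]
           for _ in range(N_EXPERTS)]

# Task-aware gate: each task carries its own gating logits over experts,
# so examples from different tasks are routed to specialised experts.
task_gates = {
    "pos_tagging": [2.0, 0.1, 0.1, 0.1],
    "ner":         [0.1, 0.1, 2.0, 0.1],
}

def moe_forward(x, task):
    logits = task_gates[task]
    exps = [math.exp(l) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]               # softmax over experts
    top = sorted(range(N_EXPERTS), key=lambda i: -probs[i])[:TOP_K]
    # Only the selected experts run, so compute cost matches a dense
    # single-expert model despite the larger total parameter count.
    out = [0.0] * DIM
    for i in top:
        for d in range(DIM):
            out[d] += probs[i] * experts[i][d] * x[d]
    return out, top

out, chosen = moe_forward([1.0, 1.0, 1.0], "pos_tagging")
```

In a trained model the gating logits would be a learned function of both the input representation and a task embedding; the fixed per-task logits here just make the routing behaviour easy to see.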
arXiv Detail & Related papers (2022-04-16T00:56:12Z)
- Improving Multi-task Generalization Ability for Neural Text Matching via
Prompt Learning [54.66399120084227]
Recent state-of-the-art neural text matching models based on pre-trained language models (PLMs) struggle to generalize to different tasks.
We adopt a specialization-generalization training strategy and refer to it as Match-Prompt.
In the specialization stage, descriptions of different matching tasks are mapped to only a few prompt tokens.
In the generalization stage, the text matching model learns the essential matching signals by being trained on diverse matching tasks.
arXiv Detail & Related papers (2022-04-06T11:01:08Z)
- Rethinking Hard-Parameter Sharing in Multi-Task Learning [20.792654758645302]
Hard parameter sharing in multi-task learning (MTL) allows tasks to share some of the model parameters, reducing storage cost and improving prediction accuracy.
The common sharing practice is to share bottom layers of a deep neural network among tasks while using separate top layers for each task.
Using separate bottom-layer parameters could achieve significantly better performance than the common practice.
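The two layouts being contrasted can be made concrete with a small parameter-accounting sketch in plain Python. Dimensions and task names are invented for illustration, and the random linear layers stand in for real network blocks; the point is only the structural difference between a shared bottom and separate bottoms.

```python
import random

random.seed(0)
DIM_IN, DIM_HID, DIM_OUT = 16, 32, 4
TASKS = ["task_a", "task_b", "task_c"]

def linear(n_in, n_out):
    return [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]

def n_params(layer):
    return sum(len(row) for row in layer)

# Common practice: one shared bottom layer, one small head per task.
shared_bottom = linear(DIM_IN, DIM_HID)
shared_heads = {t: linear(DIM_HID, DIM_OUT) for t in TASKS}
shared_total = n_params(shared_bottom) + sum(
    n_params(h) for h in shared_heads.values())

# Alternative from the paper: a separate bottom layer per task, trading
# extra storage for potentially better accuracy.
sep_bottoms = {t: linear(DIM_IN, DIM_HID) for t in TASKS}
sep_heads = {t: linear(DIM_HID, DIM_OUT) for t in TASKS}
sep_total = sum(n_params(sep_bottoms[t]) + n_params(sep_heads[t])
                for t in TASKS)

def forward(x, task, bottoms, heads):
    def apply(layer, v):
        return [sum(w * vi for w, vi in zip(row, v)) for row in layer]
    bottom = bottoms if isinstance(bottoms, list) else bottoms[task]
    return apply(heads[task], apply(bottom, x))
```

With these toy sizes the separate-bottom variant stores roughly twice the parameters of the shared-bottom one, which is the storage-versus-accuracy trade-off the summary describes.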
arXiv Detail & Related papers (2021-07-23T17:26:40Z)
- Exploring Relational Context for Multi-Task Dense Prediction [76.86090370115]
We consider a multi-task environment for dense prediction tasks, represented by a common backbone and independent task-specific heads.
We explore various attention-based contexts, such as global and local, in the multi-task setting.
We propose an Adaptive Task-Relational Context module, which samples the pool of all available contexts for each task pair.
arXiv Detail & Related papers (2021-04-28T16:45:56Z)
- Modelling Latent Skills for Multitask Language Generation [15.126163032403811]
We present a generative model for multitask conditional language generation.
Our guiding hypothesis is that a shared set of latent skills underlies many disparate language generation tasks.
We instantiate this task embedding space as a latent variable in a latent variable sequence-to-sequence model.
arXiv Detail & Related papers (2020-02-21T20:39:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and accepts no responsibility for any consequences of its use.