Friend-training: Learning from Models of Different but Related Tasks
- URL: http://arxiv.org/abs/2301.13683v1
- Date: Tue, 31 Jan 2023 15:00:56 GMT
- Title: Friend-training: Learning from Models of Different but Related Tasks
- Authors: Mian Zhang, Lifeng Jin, Linfeng Song, Haitao Mi, Xiabing Zhou and Dong Yu
- Abstract summary: Friend-training is a cross-task self-training framework.
Models trained to do different tasks are used in an iterative training, pseudo-labeling, and retraining process.
We show that the models trained with the friend-training framework achieve the best performance compared to strong baselines.
- Score: 44.25961408685873
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current self-training methods such as standard self-training, co-training,
tri-training, and others often focus on improving model performance on a single
task, utilizing differences in input features, model architectures, and
training processes. However, many tasks in natural language processing are
about different but related aspects of language, and models trained for one
task can be great teachers for other related tasks. In this work, we propose
friend-training, a cross-task self-training framework, where models trained to
do different tasks are used in an iterative training, pseudo-labeling, and
retraining process to help each other for better selection of pseudo-labels.
With two dialogue understanding tasks, conversational semantic role labeling
and dialogue rewriting, chosen for a case study, we show that the models
trained with the friend-training framework achieve the best performance
compared to strong baselines.
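The abstract describes friend-training only at a high level. The following is a minimal illustrative sketch of what a cross-task self-training loop of this kind could look like; it is not the authors' implementation. The `model_a`/`model_b` objects, the `agreement_fn` consistency score, and the data-pool interfaces are all hypothetical stand-ins, since the paper's concrete pseudo-label selection criterion is not given here.

```python
# Illustrative sketch of a friend-training style loop (not the authors' code).
# model_a / model_b stand in for two models of different but related tasks
# (e.g. conversational semantic role labeling and dialogue rewriting);
# agreement_fn is a hypothetical cross-task consistency score used to decide
# which pseudo-labels to keep.

def friend_train(model_a, model_b, labeled_a, labeled_b, unlabeled,
                 agreement_fn, rounds=3, threshold=0.8):
    """Iterative training, pseudo-labeling, and retraining: each model keeps
    a pseudo-label only when the friend model's prediction on the related
    task is consistent with it."""
    for _ in range(rounds):
        # Retrain each model on its current labeled pool.
        model_a.fit(labeled_a)
        model_b.fit(labeled_b)

        # Pseudo-label the shared unlabeled pool with both models.
        for example in unlabeled:
            pseudo_a = model_a.predict(example)
            pseudo_b = model_b.predict(example)

            # Cross-task selection: accept the pair of pseudo-labels only if
            # the two task views agree strongly enough.
            if agreement_fn(pseudo_a, pseudo_b) >= threshold:
                labeled_a.append((example, pseudo_a))
                labeled_b.append((example, pseudo_b))

    return model_a, model_b
```

The key difference from standard self-training or co-training, as the abstract frames it, is that the agreement check spans two different tasks rather than two views or two architectures for the same task.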
Related papers
- Musketeer: Joint Training for Multi-task Vision Language Model with Task Explanation Prompts [75.75548749888029]
We present a vision-language model whose parameters are jointly trained on all tasks and fully shared among multiple heterogeneous tasks.
With a single model, Musketeer achieves results comparable to or better than strong baselines trained on single tasks, almost uniformly across multiple tasks.
arXiv Detail & Related papers (2023-05-11T17:57:49Z)
- Context-Aware Language Modeling for Goal-Oriented Dialogue Systems [84.65707332816353]
We formulate goal-oriented dialogue as a partially observed Markov decision process.
We derive a simple and effective method to finetune language models in a goal-aware way.
We evaluate our method on a practical flight-booking task using AirDialogue.
arXiv Detail & Related papers (2022-04-18T17:23:11Z)
- MetaICL: Learning to Learn In Context [87.23056864536613]
We introduce MetaICL, a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks.
We show that MetaICL approaches (and sometimes beats) the performance of models fully finetuned on the target task training data, and outperforms much larger models with nearly 8x the parameters.
arXiv Detail & Related papers (2021-10-29T17:42:08Z)
- Boosting a Model Zoo for Multi-Task and Continual Learning [15.110807414130923]
"Model Zoo" is an algorithm that builds an ensemble of models, each of which is very small, and it is trained on a smaller set of tasks.
Model Zoo achieves large gains in prediction accuracy compared to state-of-the-art methods in multi-task and continual learning.
arXiv Detail & Related papers (2021-06-06T04:25:09Z)
- InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training [135.12061144759517]
We present an information-theoretic framework that formulates cross-lingual language model pre-training.
We propose a new pre-training task based on contrastive learning.
By leveraging both monolingual and parallel corpora, we jointly train on the pretext tasks to improve the cross-lingual transferability of pre-trained models.
arXiv Detail & Related papers (2020-07-15T16:58:01Z)
- SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching [81.45928589522032]
We parameterize modular task-oriented dialog systems using a Transformer-based auto-regressive language model.
We pre-train, on heterogeneous dialog corpora, a task-grounded response generation model.
Experiments show that SOLOIST sets a new state of the art on well-studied task-oriented dialog benchmarks.
arXiv Detail & Related papers (2020-05-11T17:58:34Z)
- Can You Put it All Together: Evaluating Conversational Agents' Ability to Blend Skills [31.42833993937429]
We investigate ways to combine models trained towards isolated capabilities.
We propose a new dataset, BlendedSkillTalk, to analyze how these capabilities would mesh together in a natural conversation.
Our experiments show that multi-tasking over several tasks that focus on particular capabilities results in better blended conversation performance.
arXiv Detail & Related papers (2020-04-17T20:51:40Z)