Can You Label Less by Using Out-of-Domain Data? Active & Transfer
Learning with Few-shot Instructions
- URL: http://arxiv.org/abs/2211.11798v1
- Date: Mon, 21 Nov 2022 19:03:31 GMT
- Title: Can You Label Less by Using Out-of-Domain Data? Active & Transfer
Learning with Few-shot Instructions
- Authors: Rafal Kocielnik, Sara Kangaslahti, Shrimai Prabhumoye, Meena Hari, R.
Michael Alvarez, Anima Anandkumar
- Abstract summary: We propose a novel Active Transfer Few-shot Instructions (ATF) approach which requires no fine-tuning.
ATF leverages the internal linguistic knowledge of pre-trained language models (PLMs) to facilitate the transfer of information.
We show that annotation of just a few target-domain samples via active learning can be beneficial for transfer, but the impact diminishes with more annotation effort.
- Score: 58.69255121795761
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Labeling social-media data for custom dimensions of toxicity and social bias
is challenging and labor-intensive. Existing transfer and active learning
approaches meant to reduce annotation effort require fine-tuning, which suffers
from over-fitting to noise and can cause domain shift with small sample sizes.
In this work, we propose a novel Active Transfer Few-shot Instructions (ATF)
approach which requires no fine-tuning. ATF leverages the internal linguistic
knowledge of pre-trained language models (PLMs) to facilitate the transfer of
information from existing pre-labeled datasets (source-domain task) with
minimum labeling effort on unlabeled target data (target-domain task). Our
strategy can yield positive transfer, achieving a mean AUC gain of 10.5%
compared to no transfer with a large 22b-parameter PLM. We further show that
annotation of just a few target-domain samples via active learning can be
beneficial for transfer, but the impact diminishes with more annotation effort
(26% drop in gain between 100 and 2000 annotated examples). Finally, we find
that not all transfer scenarios yield a positive gain, which seems related to
the PLM's initial performance on the target-domain task.
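The abstract describes ATF only at a high level; below is a minimal sketch of the idea under loudly stated assumptions: a small stand-in model replaces the 22b-parameter PLM, the toxicity prompt template, label verbalizers, and helper names are hypothetical, and uncertainty is a simple probability margin rather than the paper's exact acquisition function.

```python
# Minimal sketch of Active Transfer Few-shot Instructions (ATF).
# Assumptions: gpt2 stands in for the paper's 22b-parameter PLM; the
# prompt template, label verbalizers, and helper names are hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
plm = AutoModelForCausalLM.from_pretrained("gpt2")
plm.eval()

LABELS = ["no", "yes"]  # verbalized answers to a toxicity question

def build_prompt(source_shots, target_shots, text):
    """Instruction plus demonstrations: source-domain shots carry the
    transferred knowledge, target-domain shots come from active learning."""
    parts = ["Is the following text toxic? Answer yes or no."]
    for t, y in source_shots + target_shots:
        parts.append(f"Text: {t}\nToxic: {y}")
    parts.append(f"Text: {text}\nToxic:")
    return "\n\n".join(parts)

def label_probs(prompt):
    """Next-token probabilities of the verbalized labels (no fine-tuning)."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = plm(ids).logits[0, -1]
    probs = torch.softmax(logits, dim=-1)
    return [probs[tok(" " + lab).input_ids[0]].item() for lab in LABELS]

def atf_round(source_shots, target_shots, unlabeled, budget, annotate):
    """One active-learning round: send the lowest-margin (most uncertain)
    target samples to the annotator, then reuse them as demonstrations."""
    def margin(text):
        p_no, p_yes = label_probs(build_prompt(source_shots, target_shots, text))
        return abs(p_no - p_yes)
    for text in sorted(unlabeled, key=margin)[:budget]:
        target_shots.append((text, annotate(text)))  # human label
        unlabeled.remove(text)
    return target_shots, unlabeled
```

Because adaptation happens entirely in the prompt, each annotated sample is reused as a demonstration rather than a gradient update, which fits the abstract's observation that gains shrink as the annotation budget grows from 100 to 2000 examples.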
Related papers
- Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence [60.37934652213881]
Domain Adaptation (DA) facilitates knowledge transfer from a source domain to a related target domain.
This paper investigates a practical DA paradigm, namely Source data-Free Active Domain Adaptation (SFADA), where source data becomes inaccessible during adaptation.
We present Learn from the Learnt (LFTL), a novel paradigm for SFADA that leverages knowledge learnt from the source pre-trained model and from actively iterated models, without extra overhead.
arXiv Detail & Related papers (2024-07-26T17:51:58Z)
- An Efficient Active Learning Pipeline for Legal Text Classification [2.462514989381979]
We propose a pipeline for effectively using active learning with pre-trained language models in the legal domain.
We use knowledge distillation to guide the model's embeddings to a semantically meaningful space (a sketch of this idea follows the entry).
Our experiments on Contract-NLI, adapted to the classification task, and LEDGAR benchmarks show that our approach outperforms standard AL strategies.
arXiv Detail & Related papers (2022-11-15T13:07:02Z)
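The knowledge-distillation step above is described in a single line; this is a hedged sketch of one plausible reading, with assumed model choices (a BERT student, a MiniLM sentence-embedding teacher), mean pooling, and a plain MSE objective rather than the authors' exact pipeline.

```python
# Hedged sketch of embedding distillation for an AL pipeline: a sentence-
# embedding teacher guides the task encoder's representation space. Model
# choices, mean pooling, and the plain MSE objective are assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

student_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
student = AutoModel.from_pretrained("bert-base-uncased")  # task encoder
teacher_tok = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
teacher = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
teacher.eval()

proj = torch.nn.Linear(768, 384)  # student hidden size -> teacher hidden size

def mean_pool(model, tok, texts):
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1).float()
    return (hidden * mask).sum(1) / mask.sum(1)

def distill_loss(texts):
    """Pull the student's sentence embeddings toward the teacher's space."""
    with torch.no_grad():
        t = mean_pool(teacher, teacher_tok, texts)
    s = proj(mean_pool(student, student_tok, texts))
    return F.mse_loss(s, t)

# Typical use inside training: loss = task_loss + lam * distill_loss(texts)
```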
- An Exploration of Data Efficiency in Intra-Dataset Task Transfer for Dialog Understanding [65.75873687351553]
This study explores the effects of varying quantities of target task training data on sequential transfer learning in the dialog domain.
Counterintuitively, our data show that the size of the target-task training set often has minimal effect on how sequential transfer learning performs relative to the same model without transfer learning.
arXiv Detail & Related papers (2022-10-21T04:36:46Z)
- Bi-level Alignment for Cross-Domain Crowd Counting [113.78303285148041]
Current methods rely on external data for training an auxiliary task or apply an expensive coarse-to-fine estimation.
We develop a new adversarial-learning-based method that is simple and efficient to apply.
We evaluate our approach on five real-world crowd counting benchmarks, where we outperform existing approaches by a large margin.
arXiv Detail & Related papers (2022-05-12T02:23:25Z)
- Textual Entailment for Event Argument Extraction: Zero- and Few-Shot with Multi-Source Learning [22.531385318852426]
Recent work has shown that NLP tasks can be recast as Textual Entailment tasks using verbalizations.
We show that entailment is also effective for Event Argument Extraction (EAE), reducing the need for manual annotation to 50% and 20% (a minimal entailment-scoring sketch follows this entry).
arXiv Detail & Related papers (2022-05-03T08:53:55Z)
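A minimal sketch of the entailment recasting from the entry above: the sentence is the premise, and each candidate argument role is verbalized into a hypothesis scored by an off-the-shelf NLI model. The example sentence and templates are illustrative; the paper's multi-source templates are not reproduced here.

```python
# Hedged sketch: event argument extraction as textual entailment.
# The verbalization templates below are illustrative assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
nli = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")
nli.eval()

def entail_prob(premise: str, hypothesis: str) -> float:
    """Probability that the premise entails the hypothesis."""
    batch = tok(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        probs = torch.softmax(nli(**batch).logits[0], dim=-1)
    return probs[2].item()  # roberta-large-mnli: index 2 = entailment

premise = "OPEC raised oil prices in Vienna on Tuesday."
hypotheses = {
    "Place": "This event took place in Vienna.",
    "Agent": "OPEC performed the action.",
    "Time": "This event happened on Tuesday.",
}
for role, hyp in hypotheses.items():
    print(role, round(entail_prob(premise, hyp), 3))
```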
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model (a generic sketch follows this entry).
Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
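TRED itself is more involved than one snippet can show; the following is a generic, heavily hedged sketch of "source features as a fine-tuning regularizer", where a fixed linear map merely stands in for the actual disentanglement step.

```python
# Generic sketch: regularize target fine-tuning toward "relevant" source
# features. The fixed linear `relevance` map is a stand-in for TRED's
# disentanglement and is purely illustrative, not the paper's method.
import torch
import torch.nn.functional as F
import torchvision.models as models

source = models.resnet18(weights="IMAGENET1K_V1")   # frozen source model
target = models.resnet18(weights="IMAGENET1K_V1")   # copy being fine-tuned
source.eval()

feat_src = torch.nn.Sequential(*list(source.children())[:-1])  # 512-d pooled
feat_tgt = torch.nn.Sequential(*list(target.children())[:-1])

relevance = torch.nn.Linear(512, 512, bias=False)  # stand-in disentangler
for p in list(feat_src.parameters()) + list(relevance.parameters()):
    p.requires_grad_(False)

def regularized_loss(x, y, lam=0.1):
    """Task loss plus a pull toward the 'target-relevant' source features."""
    with torch.no_grad():
        z_src = relevance(feat_src(x).flatten(1))
    z_tgt = feat_tgt(x).flatten(1)
    return F.cross_entropy(target(x), y) + lam * F.mse_loss(z_tgt, z_src)
```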
- Multi-Stage Pre-training for Low-Resource Domain Adaptation [24.689862495171408]
Current approaches directly adapt a pre-trained language model (LM) on in-domain text before fine-tuning to downstream tasks.
We show that extending the vocabulary of the LM with domain-specific terms leads to further gains (see the sketch after this entry).
We apply these approaches incrementally on a pre-trained RoBERTa-large LM and show considerable performance gains on three tasks in the IT domain.
arXiv Detail & Related papers (2020-10-12T17:57:00Z)
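The vocabulary-extension step maps directly onto standard Hugging Face APIs; a brief sketch, with an illustrative (assumed) IT-domain term list:

```python
# Sketch of vocabulary extension before domain-adaptive pre-training,
# using standard Hugging Face APIs; the term list is illustrative.
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")

domain_terms = ["hypervisor", "kubelet", "segfault", "middleware"]
num_added = tok.add_tokens(domain_terms)

# resize_token_embeddings appends randomly initialized rows for the new
# tokens; they are then learned during continued in-domain pre-training.
model.resize_token_embeddings(len(tok))
print(f"added {num_added} tokens; vocabulary size is now {len(tok)}")
```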
- Continuous Transfer Learning with Label-informed Distribution Alignment [42.34180707803632]
We study a novel continuous transfer learning setting with a time-evolving target domain.
One major challenge associated with continuous transfer learning is the potential occurrence of negative transfer.
We propose a generic adversarial Variational Auto-encoder framework named TransLATE.
arXiv Detail & Related papers (2020-06-05T04:44:58Z)
- Exploring and Predicting Transferability across NLP Tasks [115.6278033699853]
We study the transferability between 33 NLP tasks across three broad classes of problems.
Our results show that transfer learning is more beneficial than previously thought.
We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task (a toy ranking sketch follows).
arXiv Detail & Related papers (2020-05-02T09:39:36Z)
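A toy sketch of how task embeddings would be used for source selection: random vectors stand in for the paper's learned embeddings, and only the cosine-ranking step is shown.

```python
# Toy sketch: rank candidate source tasks by cosine similarity of task
# embeddings. Random vectors stand in for the paper's learned embeddings.
import numpy as np

rng = np.random.default_rng(0)
task_embs = {name: rng.standard_normal(64)
             for name in ["MNLI", "SQuAD", "SST-2", "CoNLL-NER"]}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_sources(target_task, source_tasks):
    """Most similar task embedding first = predicted best transfer source."""
    t = task_embs[target_task]
    return sorted(source_tasks,
                  key=lambda s: cosine(task_embs[s], t), reverse=True)

print(rank_sources("SST-2", ["MNLI", "SQuAD", "CoNLL-NER"]))
```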