"Diversity and Uncertainty in Moderation" are the Key to Data Selection
for Multilingual Few-shot Transfer
- URL: http://arxiv.org/abs/2206.15010v1
- Date: Thu, 30 Jun 2022 04:22:27 GMT
- Title: "Diversity and Uncertainty in Moderation" are the Key to Data Selection
for Multilingual Few-shot Transfer
- Authors: Shanu Kumar, Sandipan Dandapat, Monojit Choudhury
- Abstract summary: This paper explores various strategies for selecting data for annotation that can result in a better few-shot transfer.
The proposed approaches rely on multiple measures such as data entropy using an $n$-gram language model, predictive entropy, and gradient embedding.
Experiments show that the gradient and loss embedding-based strategies consistently outperform random data selection baselines.
- Score: 13.268758633770595
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Few-shot transfer often shows substantial gain over zero-shot
transfer (Lauscher et al., 2020), which is a practically useful trade-off
between fully supervised and unsupervised learning approaches for multilingual
pretrained model-based systems. This paper explores various strategies for
selecting data for annotation that can result in a better few-shot transfer.
The proposed approaches rely on multiple measures such as data entropy using
an $n$-gram language model, predictive entropy, and gradient embedding. We propose
a loss embedding method for sequence labeling tasks, which induces diversity
and uncertainty sampling similar to gradient embedding. The proposed data
selection strategies are evaluated and compared for POS tagging, NER, and NLI
tasks for up to 20 languages. Our experiments show that the gradient and loss
embedding-based strategies consistently outperform random data selection
baselines, with gains varying with the initial performance of the zero-shot
transfer. Furthermore, the proposed method shows similar trends in improvement
even when the model is fine-tuned using a lower proportion of the original
task-specific labeled training data for zero-shot transfer.
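The selection measures named in the abstract are standard enough to sketch. Below is a minimal, illustrative Python sketch, not the authors' released code: it scores each unlabeled sentence by data entropy under an add-one-smoothed unigram language model (a stand-in for the paper's $n$-gram measure) and by the predictive entropy of a fine-tuned token classifier, then picks the highest-scoring sentences for annotation. All helper names (`ngram_data_entropy`, `predictive_entropy`, `select_for_annotation`) and the smoothing choice are assumptions made for this example.

```python
import math

import torch
import torch.nn.functional as F


def ngram_data_entropy(tokens, unigram_counts, total_count, vocab_size):
    """Entropy rate of one sentence under an add-one-smoothed unigram LM
    (illustrative stand-in for the paper's n-gram data entropy)."""
    neg_log_prob = 0.0
    for tok in tokens:
        p = (unigram_counts.get(tok, 0) + 1) / (total_count + vocab_size)
        neg_log_prob -= math.log(p)
    return neg_log_prob / max(len(tokens), 1)  # higher = rarer sentence


@torch.no_grad()
def predictive_entropy(logits):
    """Mean per-token entropy of the model's predictive distribution;
    `logits` has shape (seq_len, num_labels) for one sentence."""
    log_probs = F.log_softmax(logits, dim=-1)
    token_entropy = -(log_probs.exp() * log_probs).sum(dim=-1)
    return token_entropy.mean().item()


def select_for_annotation(scores, budget):
    """Pure uncertainty sampling: the `budget` highest-scoring indices."""
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:budget]
```

Here `unigram_counts` would be a `collections.Counter` over the target-language pool. Sentences ranked highest by either score are sent for annotation and used to fine-tune the multilingual model; the gradient- and loss-embedding strategies, which the paper finds strongest, additionally enforce diversity (a sketch of that combination follows the related-papers list).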
Related papers
- Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
- Argument Mining in Data Scarce Settings: Cross-lingual Transfer and Few-shot Techniques [5.735035463793008]
We show that for Argument Mining, data transfer obtains better results than model transfer.
For few-shot learning, the type of task (the length and complexity of the sequence spans) and the sampling method prove to be crucial.
arXiv Detail & Related papers (2024-07-04T08:59:17Z)
- Transferable Candidate Proposal with Bounded Uncertainty [1.8130068086063336]
We introduce a new experimental design, coined as Proposal Candidate, to find transferable data candidates.
A data selection algorithm is proposed, namely Transferable candidate proposal with Bounded Uncertainty (TBU).
When transferred to different model configurations, TBU consistently improves performance in existing active learning algorithms.
arXiv Detail & Related papers (2023-12-07T08:47:28Z)
- Robust Outlier Rejection for 3D Registration with Variational Bayes [70.98659381852787]
We develop a novel variational non-local network-based outlier rejection framework for robust alignment.
We propose a voting-based inlier searching strategy to cluster the high-quality hypothetical inliers for transformation estimation.
arXiv Detail & Related papers (2023-04-04T03:48:56Z)
- Squeezing Backbone Feature Distributions to the Max for Efficient Few-Shot Learning [3.1153758106426603]
Few-shot classification is a challenging problem due to the uncertainty caused by using few labelled samples.
We propose a novel transfer-based method which aims at processing the feature vectors so that they become closer to Gaussian-like distributions.
In the case of transductive few-shot learning, where unlabelled test samples are available during training, we also introduce an optimal-transport-inspired algorithm to further boost the achieved performance.
arXiv Detail & Related papers (2021-10-18T16:29:17Z)
- Few-shot Learning via Dependency Maximization and Instance Discriminant Analysis [21.8311401851523]
We study the few-shot learning problem, where a model learns to recognize new objects with extremely few labeled data per category.
We propose a simple approach to exploit unlabeled data accompanying the few-shot task for improving few-shot performance.
arXiv Detail & Related papers (2021-09-07T02:19:01Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Our proposed approach, Dash, is adaptive in its selection of unlabeled data.
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
- Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.
Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples.
We develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner; a generic sketch of this uncertainty-plus-diversity combination appears after this list.
arXiv Detail & Related papers (2020-12-18T19:03:40Z)
- Unsupervised Cross-lingual Adaptation for Sequence Tagging and Beyond [58.80417796087894]
Cross-lingual adaptation with multilingual pre-trained language models (mPTLMs) mainly consists of two lines of work: the zero-shot approach and the translation-based approach.
We propose a novel framework to consolidate the zero-shot approach and the translation-based approach for better adaptation performance.
arXiv Detail & Related papers (2020-10-23T13:47:01Z)
- Leveraging the Feature Distribution in Transfer-based Few-Shot Learning [2.922007656878633]
Few-shot classification is a challenging problem due to the uncertainty caused by using few labelled samples.
We propose a novel transfer-based method that builds on two successive steps: 1) preprocessing the feature vectors so that they become closer to Gaussian-like distributions, and 2) leveraging this preprocessing using an optimal-transport inspired algorithm.
We demonstrate that the proposed methodology achieves state-of-the-art accuracy across various datasets, backbone architectures and few-shot settings.
arXiv Detail & Related papers (2020-06-06T07:32:08Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence of each query sample to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
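The gradient- and loss-embedding strategies of the main paper, like the Minimax Active Learning entry above, combine uncertainty with diversity. A common generic construction, assumed here purely for illustration and not claimed to match any listed paper exactly, is BADGE-style selection: embed each unlabeled example by the output-layer loss gradient under the model's own most likely label, then run k-means++ seeding over those embeddings. Both helpers below (`gradient_embedding`, `kmeanspp_select`) are hypothetical names.

```python
import numpy as np


def gradient_embedding(probs, penult):
    """Output-layer cross-entropy gradient (p - one_hot(argmax p)),
    outer-producted with the penultimate features: the model's own top
    prediction serves as a pseudo-label (BADGE-style)."""
    grad = probs.copy()
    grad[np.argmax(probs)] -= 1.0
    return np.outer(grad, penult).ravel()


def kmeanspp_select(embeddings, budget, seed=0):
    """k-means++ seeding over the embeddings: each pick is sampled in
    proportion to squared distance from the nearest selected point
    (squared norm for the first pick), favouring large-gradient
    (uncertain) and mutually distant (diverse) examples."""
    rng = np.random.default_rng(seed)
    d2 = (embeddings ** 2).sum(axis=1)
    selected = []
    for _ in range(min(budget, len(embeddings))):
        total = d2.sum()
        if total == 0.0:  # remaining points coincide with chosen centres
            break
        idx = int(rng.choice(len(embeddings), p=d2 / total))
        selected.append(idx)
        dist_to_new = ((embeddings - embeddings[idx]) ** 2).sum(axis=1)
        d2 = np.minimum(d2, dist_to_new)
        d2[selected] = 0.0  # a chosen point can never be re-drawn
    return selected
```

Sampling by squared distance is what couples the two criteria: an example is chosen either because its gradient embedding is large (the model is uncertain about it) or because it lies far from everything already selected, which is the behaviour the main abstract reports as consistently beating random selection.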
This list is automatically generated from the titles and abstracts of the papers on this site.