Diversity-Aware Batch Active Learning for Dependency Parsing
- URL: http://arxiv.org/abs/2104.13936v1
- Date: Wed, 28 Apr 2021 18:00:05 GMT
- Title: Diversity-Aware Batch Active Learning for Dependency Parsing
- Authors: Tianze Shi, Adrian Benton, Igor Malioutov, Ozan İrsoy
- Abstract summary: We show that selecting diverse batches with DPPs is superior to strong selection strategies that do not enforce batch diversity.
Our diversity-aware strategy is robust under a corpus duplication setting, where diversity-agnostic sampling strategies exhibit significant degradation.
- Score: 12.579809393060858
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While the predictive performance of modern statistical dependency parsers
relies heavily on the availability of expensive expert-annotated treebank data,
not all annotations contribute equally to the training of the parsers. In this
paper, we attempt to reduce the number of labeled examples needed to train a
strong dependency parser using batch active learning (AL). In particular, we
investigate whether enforcing diversity in the sampled batches, using
determinantal point processes (DPPs), can improve over their diversity-agnostic
counterparts. Simulation experiments on an English newswire corpus show that
selecting diverse batches with DPPs is superior to strong selection strategies
that do not enforce batch diversity, especially during the initial stages of
the learning process. Additionally, our diversity-aware strategy is robust under
a corpus duplication setting, where diversity-agnostic sampling strategies
exhibit significant degradation.
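To illustrate the idea of diversity-aware batch selection, here is a minimal sketch of greedy batch selection under a DPP-style kernel. It is not the authors' implementation: the kernel form (quality times cosine similarity times quality), the greedy log-determinant search, and all names (`greedy_dpp_batch`, `quality`, `features`) are assumptions made for illustration, since this summary does not specify the paper's kernel or inference procedure.

```python
import math


def log_det(matrix):
    """Log-determinant of a small symmetric matrix via Cholesky.

    Returns -inf when the matrix is (numerically) singular, which is
    what happens when a batch contains duplicate items.
    """
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 1e-12:
                    return float("-inf")
                chol[i][j] = math.sqrt(s)
                total += 2.0 * math.log(chol[i][j])
            else:
                chol[i][j] = s / chol[j][j]
    return total


def cosine(u, v):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_dpp_batch(quality, features, batch_size):
    """Greedily pick items that maximize log det of the kernel submatrix.

    Assumed kernel: L[i][j] = quality[i] * cosine(f_i, f_j) * quality[j],
    so high-quality items are preferred but similar items repel each other.
    """
    n = len(quality)
    kernel = [
        [quality[i] * cosine(features[i], features[j]) * quality[j] for j in range(n)]
        for i in range(n)
    ]
    selected = []
    for _ in range(batch_size):
        best, best_score = None, float("-inf")
        for cand in range(n):
            if cand in selected:
                continue
            idx = selected + [cand]
            sub = [[kernel[a][b] for b in idx] for a in idx]
            score = log_det(sub)
            if score > best_score:
                best, best_score = cand, score
        if best is None:  # every remaining item is redundant
            break
        selected.append(best)
    return selected


# Items 0 and 1 are exact duplicates with high quality; item 2 differs.
batch = greedy_dpp_batch(
    quality=[0.9, 0.9, 0.5],
    features=[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    batch_size=2,
)
print(batch)
```

In this toy setup, ranking by quality alone would select both duplicates, whereas the determinant-based objective skips the second copy. This mirrors, in miniature, the corpus duplication setting mentioned in the abstract.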
Related papers
- Effective Demonstration Annotation for In-Context Learning via Language Model-Based Determinantal Point Process [45.632012199451275]
In-context learning (ICL) is a few-shot learning paradigm that involves learning mappings through input-output pairs.
Existing works depend heavily on large-scale labeled support sets, which are not always feasible in practical scenarios.
We introduce the Language Model-based Determinantal Point Process (LM-DPP) that simultaneously considers the uncertainty and diversity of unlabeled instances for optimal selection.
arXiv Detail & Related papers (2024-08-04T18:08:15Z)
- Mitigating Shortcut Learning with Diffusion Counterfactuals and Diverse Ensembles [95.49699178874683]
We propose DiffDiv, an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs).
We show that DPMs can generate images with novel feature combinations, even when trained on samples displaying correlated input features.
We show that DPM-guided diversification is sufficient to remove dependence on shortcut cues, without a need for additional supervised signals.
arXiv Detail & Related papers (2023-11-23T15:47:33Z)
- Adaptive Gating in Mixture-of-Experts based Language Models [7.936874532105228]
Sparsely activated mixture-of-experts (MoE) has emerged as a promising solution for scaling models.
This paper introduces adaptive gating in MoE, a flexible training strategy that allows tokens to be processed by a variable number of experts.
arXiv Detail & Related papers (2023-10-11T04:30:18Z)
- Active Learning Principles for In-Context Learning with Large Language Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z)
- Multi-View Knowledge Distillation from Crowd Annotations for Out-of-Domain Generalization [53.24606510691877]
We propose new methods for acquiring soft-labels from crowd-annotations by aggregating the distributions produced by existing methods.
We demonstrate that these aggregation methods lead to the most consistent performance across four NLP tasks on out-of-domain test sets.
arXiv Detail & Related papers (2022-12-19T12:40:18Z)
- Exploiting Diversity of Unlabeled Data for Label-Efficient Semi-Supervised Active Learning [57.436224561482966]
Active learning is a research area that addresses the issues of expensive labeling by selecting the most important samples for labeling.
We introduce a new diversity-based initial dataset selection algorithm to select the most informative set of samples for initial labeling in the active learning setting.
Also, we propose a novel active learning query strategy, which uses diversity-based sampling on consistency-based embeddings.
arXiv Detail & Related papers (2022-07-25T16:11:55Z)
- Variational Distillation for Multi-View Learning [104.17551354374821]
We design several variational information bottlenecks to exploit two key characteristics for multi-view representation learning.
Under a rigorous theoretical guarantee, our approach enables IB to grasp the intrinsic correlation between observations and semantic labels.
arXiv Detail & Related papers (2022-06-20T03:09:46Z)
- BERT for Sentiment Analysis: Pre-trained and Fine-Tuned Alternatives [0.0]
BERT has revolutionized the NLP field by enabling transfer learning with large language models.
This article studies how to better cope with the different embeddings provided by the BERT output layer and the usage of language-specific instead of multilingual models.
arXiv Detail & Related papers (2022-01-10T15:05:05Z)
- Deep Active Learning for Sequence Labeling Based on Diversity and Uncertainty in Gradient [5.33024001730262]
We show that the amount of labeled training data can be reduced using active learning when it incorporates both uncertainty and diversity in the sequence labeling task.
We examined the effects of our sequence-based approach, which selects weighted, diverse samples in the gradient embedding approach, across multiple tasks, datasets, and models; it consistently outperforms classic uncertainty-based sampling and diversity-based sampling.
arXiv Detail & Related papers (2020-11-27T06:03:27Z)
- Reducing Confusion in Active Learning for Part-Of-Speech Tagging [100.08742107682264]
Active learning (AL) uses a data selection algorithm to select useful training samples to minimize annotation cost.
We study the problem of selecting instances which maximally reduce the confusion between particular pairs of output tags.
Our proposed AL strategy outperforms other AL strategies by a significant margin.
arXiv Detail & Related papers (2020-11-02T06:24:58Z)
- Informed Sampling for Diversity in Concept-to-Text NLG [8.883733362171034]
We propose an Imitation Learning approach to explore the level of diversity that a language generation model can reliably produce.
Specifically, we augment the decoding process with a meta-classifier trained to distinguish which words at any given timestep will lead to high-quality output.
arXiv Detail & Related papers (2020-04-29T17:43:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.