Embracing Ambiguity: Shifting the Training Target of NLI Models
- URL: http://arxiv.org/abs/2106.03020v1
- Date: Sun, 6 Jun 2021 03:18:53 GMT
- Title: Embracing Ambiguity: Shifting the Training Target of NLI Models
- Authors: Johannes Mario Meissner, Napat Thumwanit, Saku Sugawara, Akiko Aizawa
- Abstract summary: We explore the option of training on the estimated label distribution of the annotators in the Natural Language Inference (NLI) task.
We show it is possible to reduce ChaosNLI divergence scores when finetuning on this data, a promising first step towards learning how to capture linguistic ambiguity.
- Score: 21.153381668435145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural Language Inference (NLI) datasets contain examples with highly
ambiguous labels. While many research works do not pay much attention to this
fact, several recent efforts have been made to acknowledge and embrace the
existence of ambiguity, such as UNLI and ChaosNLI. In this paper, we explore
the option of training directly on the estimated label distribution of the
annotators in the NLI task, using a learning loss based on this ambiguity
distribution instead of the gold-labels. We prepare AmbiNLI, a trial dataset
obtained from readily available sources, and show it is possible to reduce
ChaosNLI divergence scores when finetuning on this data, a promising first step
towards learning how to capture linguistic ambiguity. Additionally, we show
that training on the same amount of data but targeting the ambiguity
distribution instead of gold-labels can result in models that achieve higher
performance and learn better representations for downstream tasks.
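To make the training-target shift concrete, here is a minimal sketch (not the authors' code; the field name `label_dist` and the exact loss form are assumptions) of fine-tuning a three-way NLI classifier against an estimated annotator distribution, together with the Jensen-Shannon divergence that ChaosNLI-style evaluation reports.
```python
# Hedged sketch of distribution-targeted NLI training, assuming a classifier that
# outputs logits over the three NLI labels (entailment / neutral / contradiction)
# and a hypothetical per-example field `label_dist` holding the estimated annotator
# distribution. Illustrative only, not the AmbiNLI reference implementation.
import torch
import torch.nn.functional as F

def ambiguity_loss(logits: torch.Tensor, label_dist: torch.Tensor) -> torch.Tensor:
    """KL(annotator distribution || model) instead of cross-entropy on the gold label.

    logits:     (batch, 3) raw classifier outputs
    label_dist: (batch, 3) estimated annotator distribution, rows summing to 1
    """
    log_probs = F.log_softmax(logits, dim=-1)
    return F.kl_div(log_probs, label_dist, reduction="batchmean")

def js_divergence(logits: torch.Tensor, human_dist: torch.Tensor) -> torch.Tensor:
    """Jensen-Shannon divergence between model and human label distributions,
    the kind of score reported for ChaosNLI (lower is better)."""
    p = F.softmax(logits, dim=-1)
    m = 0.5 * (p + human_dist)

    def kl(a, b):
        return (a * (a.clamp_min(1e-12).log() - b.clamp_min(1e-12).log())).sum(-1)

    return (0.5 * kl(p, m) + 0.5 * kl(human_dist, m)).mean()
```
Replacing the one-hot gold label with the annotator distribution is the only change relative to standard fine-tuning, which is why the abstract frames this as shifting the training target rather than changing the model.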
Related papers
- Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition [50.61991746981703]
Current state-of-the-art long-tailed semi-supervised learning (LTSSL) approaches rely on high-quality pseudo-labels for large-scale unlabeled data.
This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning.
We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels.
arXiv Detail & Related papers (2024-10-08T15:06:10Z)
- Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights based on the training dynamics of the classifiers to the distantly supervised labels.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data.
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
arXiv Detail & Related papers (2024-06-20T18:35:47Z)
- Self-Knowledge Distillation for Learning Ambiguity [11.755814660833549]
Recent language models often over-confidently predict a single label without considering whether it is correct.
We propose a novel self-knowledge distillation method that enables models to learn label distributions more accurately.
We validate our method on diverse NLU benchmark datasets, and the experimental results demonstrate its effectiveness in producing better label distributions (a generic self-distillation sketch follows this entry).
arXiv Detail & Related papers (2024-06-14T05:11:32Z)
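Relating to the self-knowledge distillation entry above, a generic sketch (assumed hyperparameters alpha and temperature; not the paper's implementation) blends the model's own softened predictions with the one-hot gold label to form a soft training target:
```python
# Generic self-distillation sketch: blend the model's own tempered prediction
# (detached, e.g. from a previous epoch or a frozen copy) with the one-hot gold
# label, then train against the blend with a soft cross-entropy.
import torch
import torch.nn.functional as F

def soft_target(own_logits, gold_label, num_classes=3, alpha=0.5, temperature=2.0):
    """gold_label: (batch,) integer class indices; own_logits: (batch, C)."""
    soft = F.softmax(own_logits.detach() / temperature, dim=-1)
    hard = F.one_hot(gold_label, num_classes).float()
    return alpha * soft + (1.0 - alpha) * hard

def soft_cross_entropy(current_logits, target_dist):
    log_probs = F.log_softmax(current_logits, dim=-1)
    return -(target_dist * log_probs).sum(-1).mean()
```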
- Adaptive Integration of Partial Label Learning and Negative Learning for Enhanced Noisy Label Learning [23.847160480176697]
We propose a simple yet powerful idea called NPN, which revolutionizes noisy label learning.
We generate reliable complementary labels using all non-candidate labels for NL to enhance model robustness through indirect supervision.
Experiments conducted on both synthetically corrupted and real-world noisy datasets demonstrate the superiority of NPN compared to other state-of-the-art (SOTA) methods; a generic sketch of the negative-learning component follows this entry.
arXiv Detail & Related papers (2023-12-15T03:06:19Z)
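The indirect supervision via complementary labels mentioned in the NPN entry above can be illustrated with a standard negative-learning loss (a generic sketch, not the NPN implementation): the model is penalized for assigning probability to a label believed to be wrong.
```python
# Generic negative-learning sketch: given a complementary label assumed NOT to be
# the true class, maximize log(1 - p(complementary label)). Not the NPN code.
import torch
import torch.nn.functional as F

def negative_learning_loss(logits, complementary_labels):
    """logits: (batch, C); complementary_labels: (batch,) indices of classes
    believed to be incorrect for each example."""
    probs = F.softmax(logits, dim=-1)
    p_wrong = probs.gather(1, complementary_labels.unsqueeze(1)).squeeze(1)
    return -torch.log((1.0 - p_wrong).clamp_min(1e-12)).mean()
```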
- All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation [67.30502812804271]
Pseudo-labels are widely employed in weakly supervised 3D segmentation tasks where only sparse ground-truth labels are available for learning.
We propose a novel learning strategy to regularize the generated pseudo-labels and effectively narrow the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2023-05-25T08:19:31Z)
- Learning with Partial Labels from Semi-supervised Perspective [28.735185883881172]
Partial Label (PL) learning refers to the task of learning from partially labeled data.
We propose a novel PL learning method, namely Partial Label learning with Semi-Supervised Perspective (PLSP).
PLSP significantly outperforms the existing PL baseline methods, especially on high ambiguity levels.
arXiv Detail & Related papers (2022-11-24T15:12:16Z)
- Learning to Infer from Unlabeled Data: A Semi-supervised Learning Approach for Robust Natural Language Inference [47.293189105900524]
Natural Language Inference (NLI) aims at predicting the relation between a pair of sentences (premise and hypothesis) as entailment, contradiction or semantic independence.
Deep learning models have shown promising performance for NLI in recent years, but they rely on large-scale, expensive human-annotated datasets.
Semi-supervised learning (SSL) is a popular technique for reducing the reliance on human annotation by leveraging unlabeled data for training.
arXiv Detail & Related papers (2022-11-05T20:34:08Z)
- Discriminatively-Tuned Generative Classifiers for Robust Natural Language Inference [59.62779187457773]
We propose a generative classifier for natural language inference (NLI).
We compare it to five baselines, including discriminative models and large-scale pretrained language representation models like BERT.
Experiments show that GenNLI outperforms both discriminative and pretrained baselines across several challenging NLI experimental settings.
arXiv Detail & Related papers (2020-10-08T04:44:00Z)
- ESL: Entropy-guided Self-supervised Learning for Domain Adaptation in Semantic Segmentation [35.03150829133562]
We propose Entropy-guided Self-supervised Learning, leveraging entropy as the confidence indicator for producing more accurate pseudo-labels.
On different UDA benchmarks, ESL consistently outperforms strong SSL baselines and achieves state-of-the-art results; an entropy-based pseudo-label filter in this spirit is sketched after this entry.
arXiv Detail & Related papers (2020-06-15T18:10:09Z)
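As a rough illustration of the entropy-as-confidence idea in the ESL entry above (the threshold and interface are assumptions, not the ESL code), pseudo-labels can be kept only where the prediction entropy is low:
```python
# Illustrative entropy-guided pseudo-label filter: keep pseudo-labels only for
# examples whose predictive entropy falls below a threshold. Not the ESL code.
import torch
import torch.nn.functional as F

def entropy_filtered_pseudo_labels(logits, max_entropy=0.5):
    """logits: (batch, C) predictions on unlabeled data.
    Returns pseudo-labels and a boolean mask of confident examples."""
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)
    keep = entropy <= max_entropy
    return probs.argmax(dim=-1), keep
```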
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.