Rethinking Negative Sampling for Unlabeled Entity Problem in Named
Entity Recognition
- URL: http://arxiv.org/abs/2108.11607v2
- Date: Fri, 27 Aug 2021 03:44:07 GMT
- Title: Rethinking Negative Sampling for Unlabeled Entity Problem in Named
Entity Recognition
- Authors: Yangming Li, Lemao Liu, Shuming Shi
- Abstract summary: Unlabeled entities seriously degrade the performance of named entity recognition models.
We analyze why negative sampling succeeds both theoretically and empirically.
We propose a weighted and adaptive sampling distribution for negative sampling.
- Score: 47.273602658066196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many situations (e.g., distant supervision), the unlabeled entity problem
seriously degrades the performance of named entity recognition (NER) models.
Recently, this issue has been well addressed by a notable approach based on
negative sampling. In this work, we perform two studies along this direction.
Firstly, we analyze why negative sampling succeeds both theoretically and
empirically. Based on the observation that named entities are highly sparse in
datasets, we give a theoretical guarantee that, for a long sentence, the
probability that the sampled negatives contain no unlabeled entities is high.
Missampling tests on synthetic datasets have verified our guarantee in
practice. Secondly, to mine hard negatives and further reduce missampling
rates, we propose a weighted and adaptive sampling distribution for negative
sampling. Experiments on synthetic datasets and well-annotated datasets show
that our method significantly improves negative sampling in both robustness and
effectiveness. We have also achieved new state-of-the-art results on real-world
datasets.
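To make the setup concrete, the following is a minimal sketch (not the authors' code) of span-based negative sampling for NER: every unlabeled span is a candidate negative, and only a small fraction of them is sampled for training, so in a long sentence with sparse entities the sample rarely contains an unlabeled entity. The `score_fn` hook stands in for the weighted, adaptive sampling distribution proposed here; the sampling ratio and span-length limit are illustrative assumptions.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices


def sample_negative_spans(
    sentence: Sequence[str],
    labeled_entities: Dict[Span, str],
    score_fn=None,            # placeholder for a model-based "hard negative" score
    ratio: float = 0.35,      # fraction of tokens used as the sample-size budget
    max_span_len: int = 10,
    seed: int = 0,
) -> List[Span]:
    """Sample a small set of unlabeled spans to train on as negatives ("O" spans).

    Every span not annotated as an entity is a candidate negative. Because
    gold entities are sparse, sampling only ~ratio * len(sentence) of these
    candidates makes it unlikely that an unlabeled (missing) entity is drawn.
    """
    rng = random.Random(seed)
    n = len(sentence)

    # Enumerate all candidate spans that are not labeled entities.
    candidates = [
        (i, j)
        for i in range(n)
        for j in range(i, min(n, i + max_span_len))
        if (i, j) not in labeled_entities
    ]

    k = min(len(candidates), math.ceil(ratio * n))
    if score_fn is None:
        return rng.sample(candidates, k)

    # Weighted variant: bias sampling toward spans the current model finds
    # "hard" (higher score). Note rng.choices samples with replacement,
    # which is a simplification of a true weighted sample without replacement.
    weights = [score_fn(span) for span in candidates]
    return rng.choices(candidates, weights=weights, k=k)


if __name__ == "__main__":
    tokens = "Yangming Li works at Tencent AI Lab in Shenzhen".split()
    entities = {(0, 1): "PER", (4, 6): "ORG"}
    print(sample_negative_spans(tokens, entities))
```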
Related papers
- Learning with Imbalanced Noisy Data by Preventing Bias in Sample
Selection [82.43311784594384]
Real-world datasets contain not only noisy labels but also class imbalance.
We propose a simple yet effective method to address noisy labels in imbalanced datasets.
arXiv Detail & Related papers (2024-02-17T10:34:53Z) - Better Sampling of Negatives for Distantly Supervised Named Entity
Recognition [39.264878763160766]
We propose a simple and straightforward approach that selects, for training, the top negative samples with high similarity to all the positive samples.
Our method achieves consistent performance improvements on four distantly supervised NER datasets.
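A rough illustration of that selection rule (the function below and its shapes are assumptions, not the paper's code): rank unlabeled candidate spans by their average similarity to the positive spans and keep the most similar ones as hard negatives.

```python
import numpy as np


def select_top_negatives(neg_embs: np.ndarray, pos_embs: np.ndarray, top_k: int):
    """Pick the negative candidates most similar (on average) to all positives.

    neg_embs: (num_candidates, dim) embeddings of unlabeled candidate spans.
    pos_embs: (num_positives, dim) embeddings of labeled entity spans.
    Returns indices of the top_k candidates with the highest mean cosine
    similarity to the positive set, i.e., the "hard" negatives.
    """
    neg = neg_embs / np.linalg.norm(neg_embs, axis=1, keepdims=True)
    pos = pos_embs / np.linalg.norm(pos_embs, axis=1, keepdims=True)
    mean_sim = (neg @ pos.T).mean(axis=1)          # (num_candidates,)
    return np.argsort(-mean_sim)[:top_k]
```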
arXiv Detail & Related papers (2023-05-22T15:35:39Z) - Bayesian Self-Supervised Contrastive Learning [16.903874675729952]
This paper proposes a new self-supervised contrastive loss called the BCL loss.
The key idea is to design the desired sampling distribution for sampling hard true negative samples under the Bayesian framework.
Experiments validate the effectiveness and superiority of the BCL loss.
arXiv Detail & Related papers (2023-01-27T12:13:06Z) - SimANS: Simple Ambiguous Negatives Sampling for Dense Text Retrieval [126.22182758461244]
We show that according to the measured relevance scores, the negatives ranked around the positives are generally more informative and less likely to be false negatives.
We propose a simple ambiguous negatives sampling method, SimANS, which incorporates a new sampling probability distribution to sample more ambiguous negatives.
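A hedged sketch of that idea follows; the exact weighting function and hyper-parameters used by SimANS may differ. Candidates whose relevance scores lie near the positive's score receive the highest sampling probability, while clearly easy candidates and suspiciously high-scoring ones (likely false negatives) are down-weighted.

```python
import numpy as np


def ambiguous_negative_sampling(neg_scores, pos_score, num_samples,
                                a=1.0, b=0.0, rng=None):
    """Sample negatives whose scores lie around the positive's score.

    Candidates scoring far below the positive are easy/uninformative, and
    those scoring above it are likely false negatives, so both get low
    probability under a bell-shaped weight centered near pos_score + b.
    """
    rng = rng or np.random.default_rng(0)
    neg_scores = np.asarray(neg_scores, dtype=float)
    weights = np.exp(-a * (neg_scores - pos_score - b) ** 2)
    probs = weights / weights.sum()
    return rng.choice(len(neg_scores), size=num_samples, replace=False, p=probs)
```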
arXiv Detail & Related papers (2022-10-21T07:18:05Z) - Entity Aware Negative Sampling with Auxiliary Loss of False Negative
Prediction for Knowledge Graph Embedding [0.0]
We propose a novel method called Entity Aware Negative Sampling (EANS).
EANS is able to sample negative entities that resemble the positive one by applying a Gaussian distribution to the aligned entity index space.
The proposed method can generate high-quality negative samples regardless of negative sample size and effectively mitigate the influence of false negative samples.
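Taking that description literally, a speculative sketch (not the EANS implementation; the index construction here is an assumption) could look like this: arrange entities so that similar ones have nearby indices, then draw negative indices from a Gaussian centered on the positive entity's index.

```python
import numpy as np


def gaussian_index_negatives(pos_entity, aligned_index, num_negatives,
                             sigma=50.0, rng=None):
    """Draw negatives near the positive in an "aligned" entity index space.

    aligned_index: list of entity ids ordered so that similar entities sit at
    nearby positions (e.g., grouped by embedding clusters). Sampling an index
    offset from N(0, sigma^2) then yields negatives that resemble the
    positive entity while still corrupting the triple.
    """
    rng = rng or np.random.default_rng(0)
    pos = aligned_index.index(pos_entity)
    offsets = rng.normal(0.0, sigma, size=num_negatives).round().astype(int)
    idx = np.clip(pos + offsets, 0, len(aligned_index) - 1)
    return [aligned_index[i] for i in idx if aligned_index[i] != pos_entity]
```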
arXiv Detail & Related papers (2022-10-12T14:27:51Z) - Rethinking InfoNCE: How Many Negative Samples Do You Need? [54.146208195806636]
We study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework.
We estimate the optimal negative sampling ratio using the $K$ value that maximizes the training effectiveness function.
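For context, the quantity being tuned is the number $K$ of sampled negatives in an InfoNCE objective. A minimal NumPy sketch of that loss (shapes and the temperature value are illustrative assumptions) is:

```python
import numpy as np


def info_nce_loss(query, positive, negatives, temperature=0.07):
    """InfoNCE with one positive and K sampled negatives (NumPy sketch).

    query:     (dim,) anchor representation.
    positive:  (dim,) representation of the positive sample.
    negatives: (K, dim) representations of K sampled negatives.
    Returns -log( exp(s+) / (exp(s+) + sum_k exp(s_k)) ), where scores are
    temperature-scaled cosine similarities.
    """
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    s_pos = cos(query, positive) / temperature
    s_neg = np.array([cos(query, n) for n in negatives]) / temperature
    logits = np.concatenate(([s_pos], s_neg))
    # Numerically stable -log softmax of the positive logit.
    logsumexp = logits.max() + np.log(np.exp(logits - logits.max()).sum())
    return -(s_pos - logsumexp)
```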
arXiv Detail & Related papers (2021-05-27T08:38:29Z) - Exploiting Sample Uncertainty for Domain Adaptive Person
Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels.
Our uncertainty-guided optimization brings significant improvement and achieves the state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2020-12-16T04:09:04Z) - Simplify and Robustify Negative Sampling for Implicit Collaborative
Filtering [42.832851785261894]
In this paper, we first provide a novel understanding of negative instances by empirically observing that only a few instances are potentially important for model learning.
We then tackle the untouched false negative problem by favouring high-variance samples stored in memory.
Empirical results on two synthetic datasets and three real-world datasets demonstrate both the robustness and the superiority of our negative sampling method.
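A hedged sketch of that memory-based heuristic (names and the weighting are assumptions, not the paper's code): keep a short history of the model's predicted scores for each candidate item and favour candidates whose scores are both high and high-variance, since persistently high scorers are more likely to be false negatives.

```python
import numpy as np


def pick_robust_negative(candidate_items, score_history, rng=None, alpha=1.0):
    """Favour high-variance candidates when choosing a negative item.

    score_history: dict item -> list of the model's recent predicted scores
    for that (user, item) pair. A candidate whose score is persistently high
    is likely a false negative (an item the user actually likes), whereas a
    high-variance candidate is more likely a genuinely hard true negative.
    """
    rng = rng or np.random.default_rng(0)
    means = np.array([np.mean(score_history[i]) for i in candidate_items])
    stds = np.array([np.std(score_history[i]) for i in candidate_items])
    weights = np.exp(means + alpha * stds)      # informativeness + robustness
    probs = weights / weights.sum()
    return candidate_items[rng.choice(len(candidate_items), p=probs)]
```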
arXiv Detail & Related papers (2020-09-07T19:08:26Z) - Understanding Negative Sampling in Graph Representation Learning [87.35038268508414]
We show that negative sampling is as important as positive sampling in determining the optimization objective and the resulting variance.
We propose MCNS, which approximates the positive distribution with self-contrast approximation and accelerates negative sampling via Metropolis-Hastings.
We evaluate our method on 5 datasets that cover extensive downstream graph learning tasks, including link prediction, node classification and personalized recommendation.
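The acceleration step can be sketched as a generic Metropolis-Hastings sampler over candidate nodes; the target density below is a placeholder for the self-contrast approximation, not MCNS's actual estimator.

```python
import random


def metropolis_hastings_negatives(nodes, target_density, num_samples,
                                  burn_in=100, seed=0):
    """Draw negative nodes from an (unnormalized) target density via MH.

    Proposal: uniform over all nodes. A move from x to x' is accepted with
    probability min(1, p(x') / p(x)), so after burn-in the chain's states are
    approximately distributed according to target_density, without ever
    normalizing it over the whole graph.
    """
    rng = random.Random(seed)
    current = rng.choice(nodes)
    samples = []
    for step in range(burn_in + num_samples):
        proposal = rng.choice(nodes)
        accept = min(1.0, target_density(proposal) / max(target_density(current), 1e-12))
        if rng.random() < accept:
            current = proposal
        if step >= burn_in:
            samples.append(current)
    return samples
```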
arXiv Detail & Related papers (2020-05-20T06:25:21Z) - MixPUL: Consistency-based Augmentation for Positive and Unlabeled
Learning [8.7382177147041]
We propose a simple yet effective data augmentation method, coined MixPUL, based on consistency regularization.
MixPUL incorporates supervised and unsupervised consistency training to generate augmented data.
We show that MixPUL achieves an average improvement in classification error from 16.49 to 13.09 on the CIFAR-10 dataset across different amounts of positive data.
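As a loose illustration only (assuming a mixup-style interpolation, which may differ from the paper's exact augmentation), consistency-based augmentation for positive-unlabeled data can be sketched as:

```python
import numpy as np


def mixup_pu_augment(x_pos, x_unlabeled, alpha=0.5, rng=None):
    """Create augmented examples by interpolating positive and unlabeled data.

    Each augmented input is a convex combination of a positive example and an
    unlabeled example; its soft target is the same mixing weight, so a
    consistency loss can push the classifier's prediction on the mixed input
    toward the mixed target. (Illustrative assumption, not the paper's code.)
    """
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    i = rng.integers(len(x_pos))
    j = rng.integers(len(x_unlabeled))
    x_mix = lam * x_pos[i] + (1.0 - lam) * x_unlabeled[j]
    y_mix = lam * 1.0                   # positive label = 1, unlabeled treated as 0
    return x_mix, y_mix
```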
arXiv Detail & Related papers (2020-04-20T15:43:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.