Denoising Distantly Supervised Named Entity Recognition via a
Hypergeometric Probabilistic Model
- URL: http://arxiv.org/abs/2106.09234v1
- Date: Thu, 17 Jun 2021 04:01:25 GMT
- Title: Denoising Distantly Supervised Named Entity Recognition via a
Hypergeometric Probabilistic Model
- Authors: Wenkai Zhang, Hongyu Lin, Xianpei Han, Le Sun, Huidan Liu, Zhicheng
Wei, Nicholas Jing Yuan
- Abstract summary: Hypergeometric Learning (HGL) is a denoising algorithm for distantly supervised named entity recognition.
HGL takes both noise distribution and instance-level confidence into consideration.
Experiments show that HGL can effectively denoise the weakly-labeled data retrieved from distant supervision.
- Score: 26.76830553508229
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Denoising is the essential step for distant supervision based named entity
recognition. Previous denoising methods are mostly based on instance-level
confidence statistics, which ignore the variety of the underlying noise
distribution on different datasets and entity types. This makes them difficult
to adapt to high-noise-rate settings. In this paper, we propose
Hypergeometric Learning (HGL), a denoising algorithm for distantly supervised
NER that takes both noise distribution and instance-level confidence into
consideration. Specifically, during neural network training, we naturally model
the noise samples in each batch as following a hypergeometric distribution
parameterized by the noise rate. Each instance in the batch is then regarded as
either correct or noisy according to its label confidence derived from the
previous training step, as well as the noise distribution in the sampled
batch. Experiments show that HGL can effectively denoise the weakly-labeled
data retrieved from distant supervision and therefore yields significant
improvements in the trained models.
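The abstract gives no implementation details, but its core idea can be sketched roughly as follows: the number of noisy instances in a sampled batch is drawn from a hypergeometric distribution parameterized by the noise rate, and the lowest-confidence instances in that batch are then treated as the noisy ones. Everything below (function names, the toy numbers, the down-weighting comment) is an illustrative assumption, not the paper's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_count_in_batch(pool_size, noise_rate, batch_size, rng):
    """Draw how many of the `batch_size` sampled instances are noisy,
    assuming the pool contains roughly noise_rate * pool_size mislabeled
    instances; this is the hypergeometric model described in the abstract."""
    n_noisy_in_pool = int(round(noise_rate * pool_size))
    # Generator.hypergeometric(ngood, nbad, nsample) returns the number of
    # "good" draws; here "good" means noisy instances.
    return rng.hypergeometric(n_noisy_in_pool,
                              pool_size - n_noisy_in_pool,
                              batch_size)

def split_batch_by_confidence(confidences, k_noisy):
    """Treat the k lowest-confidence instances as noisy and the rest as
    correct (a simplification of the paper's instance-level step)."""
    order = np.argsort(confidences)   # ascending label confidence
    return order[k_noisy:], order[:k_noisy]   # clean indices, noisy indices

# Toy usage: 10k weakly-labeled instances, 30% estimated noise, batch of 32.
confidences = rng.uniform(size=32)    # stand-in for model label confidences
k = noisy_count_in_batch(10_000, 0.3, 32, rng)
clean_idx, noisy_idx = split_batch_by_confidence(confidences, k)
# Downstream, the training loss would be computed on clean_idx only (or the
# noisy instances down-weighted); that part is omitted here.
```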
Related papers
- Robustness Enhancement in Neural Networks with Alpha-Stable Training Noise [0.0]
We explore the possibility of stronger robustness for non-Gaussian impulsive noise, specifically alpha-stable noise.
By comparing the testing accuracy of models trained with Gaussian noise and with alpha-stable noise on data corrupted by different types of noise, we find that training with alpha-stable noise is more effective than training with Gaussian noise.
We propose a novel data augmentation method that replaces Gaussian noise, which is typically added to the training data, with alpha-stable noise.
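Based only on the summary above, the augmentation step amounts to perturbing training inputs with samples from an alpha-stable distribution instead of a Gaussian. A minimal sketch using scipy's levy_stable distribution (the function name and parameter values are assumptions, not taken from the paper):

```python
import numpy as np
from scipy.stats import levy_stable

def augment_with_alpha_stable(x, alpha=1.5, scale=0.05, rng=None):
    """Add symmetric alpha-stable (impulsive) noise to a batch of inputs,
    in place of the usual Gaussian perturbation (alpha=2 recovers a Gaussian)."""
    noise = levy_stable.rvs(alpha, beta=0.0, loc=0.0, scale=scale,
                            size=x.shape, random_state=rng)
    return x + noise

# Toy usage on a batch of 8 "images" of shape 32x32.
rng = np.random.default_rng(0)
batch = rng.normal(size=(8, 32, 32))
augmented = augment_with_alpha_stable(batch, alpha=1.5, scale=0.05, rng=rng)
```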
arXiv Detail & Related papers (2023-11-17T10:00:47Z)
- DiffSED: Sound Event Detection with Denoising Diffusion [70.18051526555512]
We reformulate the SED problem by taking a generative learning perspective.
Specifically, we aim to generate sound temporal boundaries from noisy proposals in a denoising diffusion process.
During training, our model learns to reverse the noising process by converting noisy latent queries to their ground-truth versions.
arXiv Detail & Related papers (2023-08-14T17:29:41Z)
- Label Noise: Correcting the Forward-Correction [0.0]
Training neural network classifiers on datasets with label noise poses a risk of overfitting them to the noisy labels.
We propose an approach to tackling the overfitting caused by label noise: imposing a lower bound on the training loss.
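The summary does not say how the lower bound is imposed. One common way to keep the training loss from being driven below a floor b is a flooding-style objective |L - b| + b; the sketch below uses that form purely as an illustration, so both the functional form and the floor value are assumptions rather than the paper's method.

```python
import torch

def floored_loss(loss, floor=0.1):
    """Flooding-style bound: once the loss drops below `floor`, the gradient
    direction flips, so optimization stops pushing the loss toward zero and
    the model stops fitting residual (possibly noisy) labels."""
    return (loss - floor).abs() + floor

# Toy usage inside a single training step (the floor value is a guess).
logits = torch.randn(16, 5, requires_grad=True)   # stand-in for model outputs
labels = torch.randint(0, 5, (16,))               # possibly noisy labels
base_loss = torch.nn.functional.cross_entropy(logits, labels)
loss = floored_loss(base_loss, floor=0.1)
loss.backward()
```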
arXiv Detail & Related papers (2023-07-24T19:41:19Z)
- Instance-dependent Noisy-label Learning with Graphical Model Based Noise-rate Estimation [16.283722126438125]
Label Noise Learning (LNL) incorporates a sample selection stage to differentiate clean and noisy-label samples.
Such a curriculum is sub-optimal since it does not consider the actual label noise rate in the training set.
This paper addresses this issue with a new noise-rate estimation method that is easily integrated with most state-of-the-art (SOTA) LNL methods.
arXiv Detail & Related papers (2023-05-31T01:46:14Z)
- Latent Class-Conditional Noise Model [54.56899309997246]
We introduce a Latent Class-Conditional Noise model (LCCN) to parameterize the noise transition under a Bayesian framework.
We then deduce a dynamic label regression method for LCCN, whose Gibbs sampler allows us to efficiently infer the latent true labels.
Our approach safeguards the stable update of the noise transition, avoiding the arbitrary tuning from a mini-batch of samples seen in previous methods.
arXiv Detail & Related papers (2023-02-19T15:24:37Z)
- Optimizing the Noise in Self-Supervised Learning: from Importance Sampling to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should be made equal to the data distribution, as in Generative Adversarial Networks (GANs).
We turn to Noise-Contrastive Estimation which grounds this self-supervised task as an estimation problem of an energy-based model of the data.
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
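For reference, the noise distribution enters noise-contrastive estimation only through the log-density ratio used as the classification logit. A minimal NCE objective for an unnormalized 1-D model with a practitioner-chosen Gaussian noise distribution (the toy model and all names are illustrative assumptions, not taken from the paper) might look like this:

```python
import numpy as np
from scipy.stats import norm

def nce_loss(log_model, log_noise, x_data, x_noise):
    """Binary NCE objective: classify data samples against samples drawn
    from the noise distribution, using the log-density ratio as the logit."""
    def logsigmoid(z):
        return -np.logaddexp(0.0, -z)
    logit_data = log_model(x_data) - log_noise(x_data)
    logit_noise = log_model(x_noise) - log_noise(x_noise)
    # Maximize log sigma(logit) on data and log sigma(-logit) on noise.
    return -(logsigmoid(logit_data).mean() + logsigmoid(-logit_noise).mean())

# Toy usage: 1-D data; the noise distribution is chosen by the practitioner.
rng = np.random.default_rng(0)
x_data = rng.normal(loc=1.0, scale=0.5, size=256)
x_noise = rng.normal(loc=0.0, scale=1.0, size=256)        # noise samples
log_noise = lambda x: norm.logpdf(x, loc=0.0, scale=1.0)  # known noise density
theta = np.array([0.0, 0.0])   # (mean, log normalizer) of a crude model
log_model = lambda x: -0.5 * (x - theta[0]) ** 2 + theta[1]
print(nce_loss(log_model, log_noise, x_data, x_noise))
```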
arXiv Detail & Related papers (2023-01-23T19:57:58Z)
- Deep Variation Prior: Joint Image Denoising and Noise Variance Estimation without Clean Data [2.3061446605472558]
This paper investigates the tasks of image denoising and noise variance estimation in a single, joint learning framework.
We build upon DVP, an unsupervised deep learning framework that simultaneously learns a denoiser and estimates noise variances.
Our method does not require any clean training images or an external step of noise estimation, and instead, approximates the minimum mean squared error denoisers using only a set of noisy images.
arXiv Detail & Related papers (2022-09-19T17:29:32Z)
- Learning from Noisy Labels with Coarse-to-Fine Sample Credibility Modeling [22.62790706276081]
Training deep neural networks (DNNs) with noisy labels is practically challenging.
Previous efforts tend to handle either part of the data or all of it in a unified denoising flow.
We propose a coarse-to-fine robust learning method called CREMA to handle noisy data in a divide-and-conquer manner.
arXiv Detail & Related papers (2022-08-23T02:06:38Z)
- Identifying Hard Noise in Long-Tailed Sample Distribution [76.16113794808001]
We introduce Noisy Long-Tailed Classification (NLT).
Most de-noising methods fail to identify the hard noises.
We design an iterative noisy learning framework called Hard-to-Easy (H2E).
arXiv Detail & Related papers (2022-07-27T09:03:03Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the common assumption that the optimal noise distribution equals the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- Confidence Scores Make Instance-dependent Label-noise Learning Possible [129.84497190791103]
In learning with noisy labels, the label of every instance can randomly walk to other classes following a transition distribution, which is called a noise model.
We introduce confidence-scored instance-dependent noise (CSIDN), where each instance-label pair is equipped with a confidence score.
We find that, with the help of confidence scores, the transition distribution of each instance can be approximately estimated.
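The "random walk to other classes" in the first sentence is simply sampling a noisy label from an instance-dependent transition distribution. The toy corruption below illustrates that noise model only; it is not the CSIDN estimator, and the transition rows are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt_label(true_label, transition_row, rng):
    """Sample a (possibly) noisy label from this instance's transition
    distribution T(x)[true_label, :]."""
    return rng.choice(len(transition_row), p=transition_row)

# Toy instance-dependent noise: 3 classes, one transition row per instance.
n_classes = 3
true_labels = rng.integers(0, n_classes, size=5)
for y in true_labels:
    # Keep the true label with probability 0.8 and spread the remaining
    # mass uniformly over the other classes (purely illustrative).
    row = np.full(n_classes, 0.2 / (n_classes - 1))
    row[y] = 0.8
    noisy = corrupt_label(y, row, rng)
    print(int(y), "->", int(noisy))
```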
arXiv Detail & Related papers (2020-01-11T16:15:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.