Denoising Enhanced Distantly Supervised Ultrafine Entity Typing
- URL: http://arxiv.org/abs/2210.09599v1
- Date: Tue, 18 Oct 2022 05:20:16 GMT
- Title: Denoising Enhanced Distantly Supervised Ultrafine Entity Typing
- Authors: Yue Zhang, Hongliang Fei, Ping Li
- Abstract summary: We build a noise model to estimate the unknown labeling noise distribution over input contexts and noisy type labels.
With the noise model, more trustworthy labels can be recovered by subtracting the estimated noise from the input.
We propose an entity typing model that adopts a bi-encoder architecture and is trained on the denoised data.
- Score: 36.14308856513851
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the task of distantly supervised (DS) ultra-fine entity typing has
received significant attention. However, DS data is noisy and often suffers
from missing or wrong labeling issues resulting in low precision and low
recall. This paper proposes a novel ultra-fine entity typing model with
denoising capability. Specifically, we build a noise model to estimate the
unknown labeling noise distribution over input contexts and noisy type labels.
With the noise model, more trustworthy labels can be recovered by subtracting
the estimated noise from the input. Furthermore, we propose an entity typing
model that adopts a bi-encoder architecture and is trained on the denoised data.
Finally, the noise model and entity typing model are trained iteratively to
enhance each other. We conduct extensive experiments on the Ultra-Fine entity
typing dataset as well as the OntoNotes dataset and demonstrate that our approach
significantly outperforms other baseline methods.
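The two ideas named in the abstract, subtracting estimated noise from noisy labels and scoring types with a bi-encoder, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the clipping to [0, 1], the sign convention (positive noise for a wrong label, negative noise for a missing one), and the sigmoid over dot products are all assumptions made for illustration, and the noise values are hand-written rather than produced by a trained noise model.

```python
import math


def denoise_labels(noisy_labels, estimated_noise):
    """Subtract estimated noise from noisy multi-hot labels, clipped to [0, 1].

    Assumed convention: positive noise marks a likely wrong label,
    negative noise marks a likely missing label.
    """
    return [
        [min(1.0, max(0.0, y - n)) for y, n in zip(label_row, noise_row)]
        for label_row, noise_row in zip(noisy_labels, estimated_noise)
    ]


def bi_encoder_scores(context_vecs, type_vecs):
    """Score every (context, type) pair with a sigmoid over the dot product.

    Contexts and types are encoded separately (here: given as plain vectors),
    which is the defining property of a bi-encoder.
    """
    scores = []
    for context in context_vecs:
        row = []
        for type_vec in type_vecs:
            logit = sum(c * t for c, t in zip(context, type_vec))
            row.append(1.0 / (1.0 + math.exp(-logit)))
        scores.append(row)
    return scores


# Toy example: one mention, three candidate types.
noisy = [[1.0, 0.0, 1.0]]
noise = [[0.9, -0.8, 0.1]]  # hypothetical noise-model output
denoised = denoise_labels(noisy, noise)
```

In this toy case the first label is down-weighted as probably wrong and the second is raised as probably missing. In an iterative scheme like the one the abstract describes, the denoised targets would be used to train the typing model, whose predictions could in turn inform the next noise estimate.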
Related papers
- NoiseBench: Benchmarking the Impact of Real Label Noise on Named Entity Recognition [3.726602636064681]
We present an analysis that shows that real noise is significantly more challenging than simulated noise.
We show that current state-of-the-art models for noise-robust learning fall far short of their theoretically achievable upper bound.
arXiv Detail & Related papers (2024-05-13T10:20:31Z)
- Noisy Label Processing for Classification: A Survey [2.8821062918162146]
In the long, tedious process of data annotation, annotators are prone to make mistakes, resulting in incorrect labels of images.
It is crucial to combat noisy labels for computer vision tasks, especially for classification tasks.
We propose an algorithm to generate a synthetic label noise pattern guided by real-world data.
arXiv Detail & Related papers (2024-04-05T15:11:09Z)
- SoftPatch: Unsupervised Anomaly Detection with Noisy Data [67.38948127630644]
This paper considers label-level noise in image sensory anomaly detection for the first time.
We propose a memory-based unsupervised AD method, SoftPatch, which efficiently denoises the data at the patch level.
Compared with existing methods, SoftPatch maintains a strong modeling ability of normal data and alleviates the overconfidence problem in coreset.
arXiv Detail & Related papers (2024-03-21T08:49:34Z)
- Robust Tiny Object Detection in Aerial Images amidst Label Noise [50.257696872021164]
This study addresses the issue of tiny object detection under noisy label supervision.
We propose a DeNoising Tiny Object Detector (DN-TOD), which incorporates a Class-aware Label Correction scheme.
Our method can be seamlessly integrated into both one-stage and two-stage object detection pipelines.
arXiv Detail & Related papers (2024-01-16T02:14:33Z)
- Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances [55.37242480995541]
We propose to denoise noisy NER data with guidance from a small set of clean instances.
Along with the main NER model we train a discriminator model and use its outputs to recalibrate the sample weights.
Results on public crowdsourcing and distant supervision datasets show that the proposed method can consistently improve performance with a small guidance set.
arXiv Detail & Related papers (2023-10-25T17:23:37Z)
- Learning to Correct Noisy Labels for Fine-Grained Entity Typing via Co-Prediction Prompt Tuning [9.885278527023532]
We introduce Co-Prediction Prompt Tuning for noise correction in FET.
We integrate prediction results to recall labeled labels and utilize a differentiated margin to identify inaccurate labels.
Experimental results on three widely-used FET datasets demonstrate that our noise correction approach significantly enhances the quality of training samples.
arXiv Detail & Related papers (2023-10-23T06:04:07Z)
- Improving the Robustness of Summarization Models by Detecting and Removing Input Noise [50.27105057899601]
We present a large empirical study quantifying the sometimes severe loss in performance from different types of input noise for a range of datasets and model sizes.
We propose a light-weight method for detecting and removing such noise in the input during model inference without requiring any training, auxiliary models, or even prior knowledge of the type of noise.
arXiv Detail & Related papers (2022-12-20T00:33:11Z)
- Is BERT Robust to Label Noise? A Study on Learning with Noisy Labels in Text Classification [23.554544399110508]
Wrong labels in training data occur when human annotators make mistakes or when the data is generated via weak or distant supervision.
It has been shown that complex noise-handling techniques are required to prevent models from fitting this label noise.
We show in this work that, for text classification tasks with modern NLP models like BERT, over a variety of noise types, existing noise-handling methods do not always improve its performance, and may even deteriorate it.
arXiv Detail & Related papers (2022-04-20T10:24:19Z)
- Analysing the Noise Model Error for Realistic Noisy Label Data [14.766574408868806]
We study the quality of estimated noise models from the theoretical side by deriving the expected error of the noise model.
We also publish NoisyNER, a new noisy label dataset from the NLP domain.
arXiv Detail & Related papers (2021-01-24T17:45:15Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
- Deep k-NN for Noisy Labels [55.97221021252733]
We show that a simple $k$-nearest neighbor-based filtering approach on the logit layer of a preliminary model can remove mislabeled data and produce more accurate models than many recently proposed methods.
arXiv Detail & Related papers (2020-04-26T05:15:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.