Related papers: Denoising Multi-Source Weak Supervision for Neural Text Classification

Denoising Multi-Source Weak Supervision for Neural Text Classification

URL: http://arxiv.org/abs/2010.04582v1
Date: Fri, 9 Oct 2020 13:57:52 GMT
Title: Denoising Multi-Source Weak Supervision for Neural Text Classification
Authors: Wendi Ren, Yinghao Li, Hanting Su, David Kartchner, Cassie Mitchell, Chao Zhang
Abstract summary: We study the problem of learning neural text classifiers without using any labeled data, but only easy-to-provide rules as multiple weak supervision sources. This problem is challenging because rule-induced weak labels are often noisy and incomplete. We design a label denoiser, which estimates the source reliability using a conditional soft attention mechanism and then reduces label noise by aggregating rule-annotated weak labels.
Score: 9.099703420721701
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the problem of learning neural text classifiers without using any labeled data, but only easy-to-provide rules as multiple weak supervision sources. This problem is challenging because rule-induced weak labels are often noisy and incomplete. To address these two challenges, we design a label denoiser, which estimates the source reliability using a conditional soft attention mechanism and then reduces label noise by aggregating rule-annotated weak labels. The denoised pseudo labels then supervise a neural classifier to predicts soft labels for unmatched samples, which address the rule coverage issue. We evaluate our model on five benchmarks for sentiment, topic, and relation classifications. The results show that our model outperforms state-of-the-art weakly-supervised and semi-supervised methods consistently, and achieves comparable performance with fully-supervised methods even without any labeled data. Our code can be found at https://github.com/weakrules/Denoise-multi-weak-sources.

Related papers

Trusted Multi-view Learning with Label Noise [17.458306450909316]
Multi-view learning methods often focus on improving decision accuracy while neglecting the decision uncertainty. We propose a trusted multi-view noise refining method to solve this problem. We empirically compare TMNR with state-of-the-art trusted multi-view learning and label noise learning baselines on 5 publicly available datasets.
arXiv Detail & Related papers (2024-04-18T06:47:30Z)
Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications. In this paper, we reformulate the label-noise problem from a generative-model perspective. Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z)
BadLabel: A Robust Perspective on Evaluating and Enhancing Label-noise Learning [113.8799653759137]
We introduce a novel label noise type called BadLabel, which can significantly degrade the performance of existing LNL algorithms by a large margin. BadLabel is crafted based on the label-flipping attack against standard classification. We propose a robust LNL method that perturbs the labels in an adversarial manner at each epoch to make the loss values of clean and noisy labels again distinguishable.
arXiv Detail & Related papers (2023-05-28T06:26:23Z)
Robust Training of Graph Neural Networks via Noise Governance [27.767913371777247]
Graph Neural Networks (GNNs) have become widely-used models for semi-supervised learning. In this paper, we consider an important yet challenging scenario where labels on nodes of graphs are not only noisy but also scarce. We propose a novel RTGNN framework that achieves better robustness by learning to explicitly govern label noise.
arXiv Detail & Related papers (2022-11-12T09:25:32Z)
Learning from Noisy Labels with Coarse-to-Fine Sample Credibility Modeling [22.62790706276081]
Training deep neural network (DNN) with noisy labels is practically challenging. Previous efforts tend to handle part or full data in a unified denoising flow. We propose a coarse-to-fine robust learning method called CREMA to handle noisy data in a divide-and-conquer manner.
arXiv Detail & Related papers (2022-08-23T02:06:38Z)
Label Noise-Resistant Mean Teaching for Weakly Supervised Fake News Detection [93.6222609806278]
We propose a novel label noise-resistant mean teaching approach (LNMT) for weakly supervised fake news detection. LNMT leverages unlabeled news and feedback comments of users to enlarge the amount of training data. LNMT establishes a mean teacher framework equipped with label propagation and label reliability estimation.
arXiv Detail & Related papers (2022-06-10T16:01:58Z)
S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise. In the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space. Our method significantly surpasses previous methods on both CIFARCIFAR100 with artificial noise and real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z)
Robust Long-Tailed Learning under Label Noise [50.00837134041317]
This work investigates the label noise problem under long-tailed label distribution. We propose a robust framework,algo, that realizes noise detection for long-tailed learning. Our framework can naturally leverage semi-supervised learning algorithms to further improve the generalisation.
arXiv Detail & Related papers (2021-08-26T03:45:00Z)
An Ensemble Noise-Robust K-fold Cross-Validation Selection Method for Noisy Labels [0.9699640804685629]
Large-scale datasets tend to contain mislabeled samples that can be memorized by deep neural networks (DNNs) We present Ensemble Noise-robust K-fold Cross-Validation Selection (E-NKCVS) to effectively select clean samples from noisy data. We evaluate our approach on various image and text classification tasks where the labels have been manually corrupted with different noise ratios.
arXiv Detail & Related papers (2021-07-06T02:14:52Z)
Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances. Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks. We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.