Rank-Aware Negative Training for Semi-Supervised Text Classification
- URL: http://arxiv.org/abs/2306.07621v1
- Date: Tue, 13 Jun 2023 08:41:36 GMT
- Title: Rank-Aware Negative Training for Semi-Supervised Text Classification
- Authors: Ahmed Murtadha, Shengfeng Pan, Wen Bo, Jianlin Su, Xinxin Cao, Wenze
Zhang, Yunfeng Liu
- Abstract summary: Semi-supervised text classification (SSTC) paradigms typically follow the spirit of self-training.
This paper presents a Rank-aware Negative Training (RNT) framework that addresses SSTC in a learning-with-noisy-labels manner.
- Score: 3.105629960108712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-supervised text classification (SSTC) paradigms typically follow
the spirit of self-training. The key idea is to train a deep classifier on
limited labeled texts and then iteratively predict pseudo-labels for the
unlabeled texts to use in further training. However, performance largely
depends on the accuracy of these pseudo-labels, which may be low in real-world
scenarios. This paper presents a Rank-aware Negative Training (RNT) framework
that addresses SSTC in a learning-with-noisy-labels manner. To alleviate the
noisy information, we adapt a reasoning-with-uncertainty approach that ranks
the unlabeled texts by the evidential support they receive from the labeled
texts. Moreover, we propose the use of negative training to train RNT, based
on the idea that "the input instance does not belong to the complementary
label". A complementary label is randomly selected from all labels except the
target label. Intuitively, the probability that the true label is chosen as
the complementary label is low, so negative training injects less noisy
information during training, resulting in better performance on the test data.
Finally, we evaluate the proposed solution on various text classification
benchmark datasets. Our extensive experiments show that it consistently
outperforms the state-of-the-art alternatives in most scenarios and achieves
competitive performance in the others. The code of RNT is publicly available
at: https://github.com/amurtadha/RNT.
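To make the two core ideas concrete, here is a minimal sketch, not the authors' implementation: predictive entropy stands in for RNT's evidential-support ranking (the paper's actual ranking reasons over evidence from the labeled texts), and the loss is a standard complementary-label (negative learning) objective. All function names and the toy usage are illustrative:

    import torch
    import torch.nn.functional as F

    def rank_by_uncertainty(logits: torch.Tensor) -> torch.Tensor:
        """Sort unlabeled examples from most to least certain.

        Stand-in for RNT's evidential ranking: lower predictive entropy is
        treated as stronger support for keeping the pseudo-label.
        """
        probs = F.softmax(logits, dim=-1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
        return torch.argsort(entropy)  # ascending entropy: most certain first

    def negative_training_loss(logits: torch.Tensor,
                               pseudo_labels: torch.Tensor) -> torch.Tensor:
        """Complementary-label loss: "the input does NOT belong to this label".

        A complementary label is sampled uniformly from all classes except the
        (possibly noisy) pseudo-label, and its predicted probability is pushed
        toward zero via -log(1 - p).
        """
        num_classes = logits.size(-1)
        # Offsets in [1, num_classes - 1] guarantee comp_labels != pseudo_labels.
        offsets = torch.randint(1, num_classes, pseudo_labels.shape,
                                device=logits.device)
        comp_labels = (pseudo_labels + offsets) % num_classes
        probs = F.softmax(logits, dim=-1)
        p_comp = probs.gather(-1, comp_labels.unsqueeze(-1)).squeeze(-1)
        return -(1.0 - p_comp).clamp_min(1e-12).log().mean()

    # Toy usage: keep the most certain half of the pseudo-labeled texts.
    logits = torch.randn(8, 4)              # 8 unlabeled texts, 4 classes
    pseudo = logits.argmax(dim=-1)          # self-training pseudo-labels
    keep = rank_by_uncertainty(logits)[:4]
    loss = negative_training_loss(logits[keep], pseudo[keep])

In RNT itself, the ranking comes from evidential reasoning over the labeled texts rather than raw entropy; the high-ranked subset is then trained with a negative loss of this form.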
Related papers
- Determined Multi-Label Learning via Similarity-Based Prompt [12.428779617221366]
In multi-label classification, each training instance is associated with multiple class labels simultaneously, which makes exhaustive annotation costly.
To alleviate this problem, a novel labeling setting termed Determined Multi-Label Learning (DMLL) is proposed.
arXiv Detail & Related papers (2024-03-25T07:08:01Z)
- JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification [65.268245109828]
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data.
Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation.
We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning methods.
arXiv Detail & Related papers (2023-10-23T05:43:35Z)
- Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data [70.25049762295193]
We introduce a novel conditional image generation framework that accepts noisy-labeled and uncurated data during training.
We propose soft curriculum learning, which assigns instance-wise weights for adversarial training while assigning new labels for unlabeled data.
Our experiments show that our approach outperforms existing semi-supervised and label-noise robust methods in terms of both quantitative and qualitative performance.
arXiv Detail & Related papers (2023-07-17T08:31:59Z)
- Label Noise-Resistant Mean Teaching for Weakly Supervised Fake News Detection [93.6222609806278]
We propose a novel label noise-resistant mean teaching approach (LNMT) for weakly supervised fake news detection.
LNMT leverages unlabeled news and feedback comments of users to enlarge the amount of training data.
LNMT establishes a mean teacher framework equipped with label propagation and label reliability estimation.
arXiv Detail & Related papers (2022-06-10T16:01:58Z)
- Context-based Virtual Adversarial Training for Text Classification with Noisy Labels [1.9508698179748525]
We propose context-based virtual adversarial training (ConVAT) to prevent a text classifier from overfitting to noisy labels.
Unlike previous works, the proposed method performs adversarial training at the context level rather than at the input level.
We conduct extensive experiments on four text classification datasets with two types of label noises.
arXiv Detail & Related papers (2022-05-29T14:19:49Z)
- Label Semantic Aware Pre-training for Few-shot Text Classification [53.80908620663974]
We propose Label Semantic Aware Pre-training (LSAP) to improve the generalization and data efficiency of text classification systems.
LSAP incorporates label semantics into pre-trained generative models (T5 in our case) by performing secondary pre-training on labeled sentences from a variety of domains.
arXiv Detail & Related papers (2022-04-14T17:33:34Z)
- Debiased Pseudo Labeling in Self-Training [77.83549261035277]
Deep neural networks achieve remarkable performance on a wide range of tasks with the aid of large-scale labeled datasets.
To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily-available unlabeled data.
We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads.
arXiv Detail & Related papers (2022-02-15T02:14:33Z)
- SENT: Sentence-level Distant Relation Extraction via Negative Training [45.98674099149065]
Using bag labels for sentence-level training will introduce much noise, thus severely degrading performance.
We propose the use of negative training (NT), in which a model is trained using complementary labels under the premise that "the instance does not belong to these complementary labels".
Based on NT, we propose a sentence-level framework, SENT, for distant relation extraction.
arXiv Detail & Related papers (2021-06-22T06:49:05Z)
- Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning [18.531022315325583]
Exploiting label hierarchies has become a promising approach to tackling the zero-shot multi-label text classification problem.
We propose a Reinforced Label Hierarchy Reasoning (RLHR) approach to encourage interdependence among labels in the hierarchies during training.
arXiv Detail & Related papers (2021-04-04T19:14:09Z)
- PseudoSeg: Designing Pseudo Labels for Semantic Segmentation [78.35515004654553]
We present a re-design of pseudo-labeling to generate structured pseudo labels for training with unlabeled or weakly-labeled data.
We demonstrate the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.
arXiv Detail & Related papers (2020-10-19T17:59:30Z)