IDoFew: Intermediate Training Using Dual-Clustering in Language Models
for Few Labels Text Classification
- URL: http://arxiv.org/abs/2401.04025v1
- Date: Mon, 8 Jan 2024 17:07:37 GMT
- Title: IDoFew: Intermediate Training Using Dual-Clustering in Language Models
for Few Labels Text Classification
- Authors: Abdullah Alsuhaibani, Hamad Zogan, Imran Razzak, Shoaib Jameel,
Guandong Xu
- Abstract summary: Bidirectional Encoder Representations from Transformers (BERT) has been very effective in various Natural Language Processing (NLP) and text mining tasks, including text classification.
However, some tasks still pose challenges for these models, including text classification with limited labels.
We have developed a novel two-stage intermediate clustering approach with subsequent fine-tuning that models the pseudo-labels reliably.
- Score: 24.11420537250414
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models such as Bidirectional Encoder Representations from
Transformers (BERT) have been very effective in various Natural Language
Processing (NLP) and text mining tasks including text classification. However,
some tasks still pose challenges for these models, including text
classification with limited labels. This can result in a cold-start problem.
Although some approaches have attempted to address this problem through
single-stage clustering as an intermediate training step coupled with a
pre-trained language model, which generates pseudo-labels to improve
classification, these methods are often error-prone due to the limitations of
the clustering algorithms. To overcome this, we have developed a novel
two-stage intermediate clustering approach with subsequent fine-tuning that
models the pseudo-labels reliably, resulting in reduced prediction errors. The
key novelty of our model, IDoFew, is that the two-stage clustering, built on
two different clustering algorithms, exploits the strengths of the
complementary algorithms and reduces the errors in generating reliable
pseudo-labels for fine-tuning. Our approach shows significant improvements
over strong baselines.
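To make the pipeline concrete, the following minimal sketch illustrates the general idea of two-stage clustering for pseudo-label generation followed by intermediate training. It is an illustration under assumptions, not the authors' exact IDoFew recipe: TF-IDF embeddings stand in for a BERT encoder, KMeans and agglomerative clustering stand in for the two complementary clustering algorithms, and logistic regression stands in for language-model fine-tuning.

# Illustrative sketch of two-stage clustering for pseudo-label generation
# (stand-in components, not the exact IDoFew configuration).
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.linear_model import LogisticRegression


def two_stage_pseudo_labels(texts, n_clusters=4, seed=0):
    # Embed the unlabeled texts (a sentence encoder would be used in practice).
    X = TfidfVectorizer(max_features=5000).fit_transform(texts).toarray()

    # Stage 1: centroid-based clustering gives a first set of pseudo-labels.
    a = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    # Stage 2: a complementary, hierarchical clustering of the same embeddings.
    b = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(X)

    # Align stage-2 cluster ids to stage-1 ids via maximum-overlap matching.
    overlap = np.zeros((n_clusters, n_clusters), dtype=int)
    for i, j in zip(a, b):
        overlap[i, j] += 1
    rows, cols = linear_sum_assignment(-overlap)
    remap = {c: r for r, c in zip(rows, cols)}
    b_aligned = np.array([remap[j] for j in b])

    # Keep only examples where the two views agree: these become the
    # "reliable" pseudo-labels for the intermediate training step.
    mask = a == b_aligned
    return X[mask], a[mask]


if __name__ == "__main__":
    docs = ["cheap flights to rome", "book a hotel in paris",
            "python raises a type error", "fix a segmentation fault in c",
            "best pasta recipe", "how to bake sourdough bread"] * 5
    X_pl, y_pl = two_stage_pseudo_labels(docs, n_clusters=3)
    print(len(y_pl), "pseudo-labeled examples retained for intermediate training")
    if len(set(y_pl)) > 1:
        # Intermediate training on pseudo-labels; logistic regression stands
        # in for fine-tuning the language model before the final few-label stage.
        LogisticRegression(max_iter=1000).fit(X_pl, y_pl)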
Related papers
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike in single-label semi-supervised learning, one cannot simply select the most probable label as the pseudo-label in SSMLL, because an instance can carry multiple semantics.
We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z)
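The point made in the entry above, that picking the single most probable label breaks down in the multi-label setting, can be shown with a small illustration; the 0.5 threshold is an illustrative choice, not the dual-perspective method proposed in that paper.

# Why argmax pseudo-labels are a poor fit for multi-label data: an instance
# can carry several labels at once, so thresholding per class (illustrative
# value 0.5) is a more natural starting point than picking a single winner.
import numpy as np

probs = np.array([[0.81, 0.64, 0.07],    # instance with two likely labels
                  [0.12, 0.33, 0.91]])   # instance with one likely label

single_label = probs.argmax(axis=1)        # [0, 2] -- drops the second label
multi_label = (probs >= 0.5).astype(int)   # [[1, 1, 0], [0, 0, 1]]

print(single_label)
print(multi_label)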
- Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification [11.072083437769093]
We propose a novel model named SharpReCL for imbalanced text classification tasks.
Our model even outperforms popular large language models across several datasets.
arXiv Detail & Related papers (2024-05-19T11:33:49Z)
- Progressive Sub-Graph Clustering Algorithm for Semi-Supervised Domain Adaptation Speaker Verification [17.284276598514502]
We propose a novel progressive subgraph clustering algorithm based on multi-model voting and double-Gaussian based assessment.
To prevent disastrous clustering results, we adopt an iterative approach that progressively increases k and employs a double-Gaussian based assessment algorithm.
arXiv Detail & Related papers (2023-05-22T04:26:18Z)
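A loose sketch of the "progressively increase k and assess" loop described in the entry above follows. The silhouette score is used purely as a stand-in for the paper's double-Gaussian based assessment, and the multi-model voting step is omitted.

# Loose illustration of progressively increasing the number of clusters and
# keeping the configuration that scores best under a quality assessment.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def progressive_clustering(X, k_min=2, k_max=10, seed=0):
    best_k, best_score, best_labels = None, -1.0, None
    for k in range(k_min, k_max + 1):
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        score = silhouette_score(X, labels)   # stand-in quality assessment
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return best_k, best_labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three synthetic "speaker embedding" blobs.
    X = np.vstack([rng.normal(c, 0.3, size=(40, 8)) for c in (0.0, 2.0, 4.0)])
    k, labels = progressive_clustering(X)
    print("selected k:", k)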
- SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
arXiv Detail & Related papers (2023-01-26T03:53:25Z)
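The quantity-quality trade-off mentioned above can be pictured as soft, confidence-based weighting of pseudo-labels instead of hard thresholding; the Gaussian weighting below is a simplified sketch of that idea, not the exact SoftMatch formulation.

# Weight pseudo-labels by confidence instead of discarding low-confidence
# ones: every unlabeled example contributes (quantity), but less confident
# predictions are down-weighted (quality).
import numpy as np

def soft_weights(confidences):
    mu = confidences.mean()               # batch statistics of confidence
    var = confidences.var() + 1e-8
    w = np.exp(-((np.clip(confidences, None, mu) - mu) ** 2) / (2 * var))
    return w                              # 1.0 at or above the mean, decaying below

conf = np.array([0.99, 0.95, 0.70, 0.40, 0.20])
print(soft_weights(conf).round(3))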
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for its limitations is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject the standard Gaussian noise and regularize hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
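The noise-injection idea in the entry above can be sketched as follows: perturb a hidden representation with standard Gaussian noise and penalize how much the classifier output changes. The model, layer split, noise scale, and loss weight are illustrative assumptions, not the exact LNSR objective.

# Sketch of noise stability regularization: add Gaussian noise to a hidden
# representation and penalize the change in the model's output.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(768, 256), nn.ReLU())  # stand-in lower layers
head = nn.Linear(256, 4)                                 # stand-in classifier head

def loss_with_noise_stability(x, y, sigma=0.1, lam=1.0):
    h = encoder(x)                        # hidden representation
    logits = head(h)
    task_loss = F.cross_entropy(logits, y)

    noisy_logits = head(h + sigma * torch.randn_like(h))  # inject Gaussian noise
    reg = F.mse_loss(noisy_logits, logits)                # stability penalty
    return task_loss + lam * reg

x = torch.randn(8, 768)
y = torch.randint(0, 4, (8,))
print(loss_with_noise_stability(x, y).item())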
- Prototypical Calibration for Few-shot Learning of Language Models [84.5759596754605]
GPT-like models have been recognized as fragile across different hand-crafted templates and demonstration permutations.
We propose prototypical calibration to adaptively learn a more robust decision boundary for zero- and few-shot classification.
Our method calibrates the decision boundary as expected, greatly improving the robustness of GPT to templates, permutations, and class imbalance.
arXiv Detail & Related papers (2022-05-20T13:50:07Z)
- Adaptive label thresholding methods for online multi-label classification [4.028101568570768]
Existing online multi-label classification works cannot handle the online label thresholding problem.
This paper proposes a novel framework of adaptive label thresholding algorithms for online multi-label classification.
arXiv Detail & Related papers (2021-12-04T10:34:09Z)
- Multitask Learning for Class-Imbalanced Discourse Classification [74.41900374452472]
We show that a multitask approach can improve the Micro F1-score by 7% over current state-of-the-art benchmarks.
We also offer a comparative review of additional techniques proposed to address resource-poor problems in NLP.
arXiv Detail & Related papers (2021-01-02T07:13:41Z)
- Enhancement of Short Text Clustering by Iterative Classification [0.0]
Iterative classification applies outlier removal to obtain outlier-free clusters.
It trains a classification algorithm using the non-outliers based on their cluster distributions.
By repeating this several times, we obtain a much improved clustering of texts.
arXiv Detail & Related papers (2020-01-31T02:12:05Z)
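The iterative-classification loop described in the entry above can be sketched directly: treat cluster ids as labels, drop per-cluster outliers, train a classifier on the remainder, re-label everything, and repeat. The centroid-distance outlier criterion and the logistic-regression classifier below are illustrative assumptions, not necessarily the choices made in the paper.

# Sketch of refining a text clustering by iterative classification.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression


def iterative_classification(texts, n_clusters=3, n_iter=5, keep=0.8, seed=0):
    X = TfidfVectorizer().fit_transform(texts).toarray()
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)

    for _ in range(n_iter):
        keep_idx = []
        for c in np.unique(labels):
            idx = np.where(labels == c)[0]
            centroid = X[idx].mean(axis=0)
            dist = np.linalg.norm(X[idx] - centroid, axis=1)
            # Keep the fraction of each cluster closest to its centroid.
            keep_idx.extend(idx[np.argsort(dist)[: max(1, int(keep * len(idx)))]])
        clf = LogisticRegression(max_iter=1000).fit(X[keep_idx], labels[keep_idx])
        labels = clf.predict(X)             # re-label all texts
        if len(np.unique(labels)) < 2:      # degenerate collapse; stop early
            break
    return labels


if __name__ == "__main__":
    docs = ["win a free prize now", "claim your prize money",
            "meeting at noon tomorrow", "schedule the project meeting",
            "new pizza place downtown", "best pizza in town"] * 4
    print(iterative_classification(docs, n_clusters=3))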
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.