Towards Robust Few-Shot Text Classification Using Transformer Architectures and Dual Loss Strategies
- URL: http://arxiv.org/abs/2505.06145v1
- Date: Fri, 09 May 2025 15:54:08 GMT
- Title: Towards Robust Few-Shot Text Classification Using Transformer Architectures and Dual Loss Strategies
- Authors: Xu Han, Yumeng Sun, Weiqiang Huang, Hongye Zheng, Junliang Du
- Abstract summary: This paper proposes a strategy that combines adaptive fine-tuning, contrastive learning, and regularization optimization to improve the classification performance of Transformer-based models. Experiments on the FewRel 2.0 dataset show that T5-small, DeBERTa-v3, and RoBERTa-base perform well in few-shot tasks.
- Score: 6.78820305740543
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot text classification has important application value in low-resource environments. This paper proposes a strategy that combines adaptive fine-tuning, contrastive learning, and regularization optimization to improve the classification performance of Transformer-based models. Experiments on the FewRel 2.0 dataset show that T5-small, DeBERTa-v3, and RoBERTa-base perform well in few-shot tasks, especially in the 5-shot setting, where they capture text features more effectively and achieve higher classification accuracy. The experiments also reveal significant differences in classification difficulty across relation categories: some categories have fuzzy semantic boundaries or complex feature distributions, making it difficult for the standard cross-entropy loss to learn the discriminative information needed to separate them. Introducing a contrastive loss and a regularization loss enhances the generalization ability of the model and effectively alleviates overfitting in few-shot settings. In addition, the results show that Transformer models with stronger self-attention mechanisms, as well as generative architectures, help improve the stability and accuracy of few-shot classification.
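The dual loss strategy described in the abstract (standard cross-entropy augmented with a contrastive term and a regularization term) can be sketched concretely. The PyTorch snippet below is a minimal sketch under our own assumptions, not the authors' released code: it pairs cross-entropy with a supervised contrastive loss over batch embeddings and a plain L2 weight penalty, and the weights lambda_con and lambda_reg are hypothetical hyperparameters.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss over a batch (Khosla et al. style).

    features: (B, D) embeddings; labels: (B,) integer class ids.
    Pulls same-class embeddings together and pushes different-class
    embeddings apart, which is the role the paper assigns to its
    contrastive term.
    """
    features = F.normalize(features, dim=1)
    sim = features @ features.T / temperature           # (B, B) similarities
    b = labels.size(0)
    self_mask = torch.eye(b, dtype=torch.bool, device=features.device)
    sim = sim.masked_fill(self_mask, float('-inf'))     # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)     # avoid -inf * 0 = NaN
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)       # rows w/o positives contribute 0
    return (-(log_prob * pos_mask).sum(dim=1) / pos_counts).mean()

def dual_loss(logits, features, labels, model, lambda_con=0.5, lambda_reg=1e-4):
    """Combined objective: cross-entropy + contrastive + L2 regularization.

    lambda_con and lambda_reg are assumed values; the paper does not
    publish its exact weighting here.
    """
    ce = F.cross_entropy(logits, labels)
    con = supervised_contrastive_loss(features, labels)
    reg = sum(p.pow(2).sum() for p in model.parameters())
    return ce + lambda_con * con + lambda_reg * reg
```

In a 5-shot episode the batch would hold the support examples of the sampled classes, so the contrastive term directly shapes the class boundaries that cross-entropy alone struggles to learn for semantically fuzzy categories.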
Related papers
- Intelligently Augmented Contrastive Tensor Factorization: Empowering Multi-dimensional Time Series Classification in Low-Data Environments [4.77513566805416]
We propose a versatile yet data-efficient framework, Intelligently Augmented Contrastive Tensor Factorization (ITA-CTF). The ITA-CTF module learns effective representations from multi-dimensional time series and incorporates a new contrastive loss optimization for similarity learning and class-awareness. Compared to standard and several DL benchmarks, notable performance improvements of up to 18.7% were achieved.
arXiv Detail & Related papers (2025-05-03T11:28:13Z)
- Multi-Level Attention and Contrastive Learning for Enhanced Text Classification with an Optimized Transformer [0.0]
This paper studies a text classification algorithm based on an improved Transformer, aiming to improve the performance and efficiency of the model in text classification tasks. The improved Transformer outperforms comparison models such as BiLSTM, CNN, the standard Transformer, and BERT in terms of classification accuracy, F1 score, and recall.
arXiv Detail & Related papers (2025-01-23T08:32:27Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Balanced Classification: A Unified Framework for Long-Tailed Object Detection [74.94216414011326]
Conventional detectors suffer from performance degradation when dealing with long-tailed data due to a classification bias towards the majority head categories.
We introduce a unified framework called BAlanced CLassification (BACL), which enables adaptive rectification of inequalities caused by disparities in category distribution.
BACL consistently achieves performance improvements across various datasets with different backbones and architectures.
arXiv Detail & Related papers (2023-08-04T09:11:07Z)
- TART: Improved Few-shot Text Classification Using Task-Adaptive Reference Transformation [23.02986307143718]
We propose a novel Task-Adaptive Reference Transformation (TART) network to enhance generalization.
Our model surpasses the state-of-the-art method by 7.4% and 5.4% in 1-shot and 5-shot classification on the 20 Newsgroups dataset.
arXiv Detail & Related papers (2023-06-03T18:38:02Z)
- Regularization Through Simultaneous Learning: A Case Study on Plant Classification [0.0]
This paper introduces Simultaneous Learning, a regularization approach drawing on principles of Transfer Learning and Multi-task Learning.
We leverage auxiliary datasets alongside the target dataset, UFOP-HVD, to facilitate simultaneous classification guided by a customized loss function.
Remarkably, our approach demonstrates superior performance over models without regularization.
arXiv Detail & Related papers (2023-05-22T19:44:57Z)
- Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification [61.411869453639845]
We introduce a bi-reconstruction mechanism that can simultaneously accommodate inter-class and intra-class variations.
This design effectively helps the model to explore more subtle and discriminative features.
Experimental results on three widely used fine-grained image classification datasets consistently show considerable improvements.
arXiv Detail & Related papers (2022-11-30T16:55:14Z)
- Adaptive Meta-learner via Gradient Similarity for Few-shot Text Classification [11.035878821365149]
We propose a novel Adaptive Meta-learner via Gradient Similarity (AMGS) method to improve the model's ability to generalize to new tasks.
Experimental results on several benchmarks demonstrate that the proposed AMGS consistently improves few-shot text classification performance.
arXiv Detail & Related papers (2022-09-10T16:14:53Z)
- Ortho-Shot: Low Displacement Rank Regularization with Data Augmentation for Few-Shot Learning [23.465747123791772]
In few-shot classification, the primary goal is to learn representations that generalize well for novel classes.
We propose an efficient low displacement rank (LDR) regularization strategy termed Ortho-Shot.
arXiv Detail & Related papers (2021-10-18T14:58:36Z)
- Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function [106.69643619725652]
We develop a training strategy that allows even a simple BiLSTM model, when trained with cross-entropy loss, to achieve competitive results.
We report state-of-the-art results for text classification on several benchmark datasets.
arXiv Detail & Related papers (2020-09-08T21:55:22Z)
- Few-shot Classification via Adaptive Attention [93.06105498633492]
We propose a novel few-shot learning method that optimizes and quickly adapts the query sample representation based on very few reference samples.
As demonstrated experimentally, the proposed model achieves state-of-the-art classification results on various benchmark few-shot classification and fine-grained recognition datasets.
arXiv Detail & Related papers (2020-08-06T05:52:59Z)
- Understanding and Diagnosing Vulnerability under Adversarial Attacks [62.661498155101654]
Deep Neural Networks (DNNs) are known to be vulnerable to adversarial attacks.
We propose a novel interpretability method, InterpretGAN, to generate explanations for the latent-variable features used for classification.
We also design the first diagnostic method to quantify the vulnerability contributed by each layer.
arXiv Detail & Related papers (2020-07-17T01:56:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.