ESimCSE Unsupervised Contrastive Learning Jointly with UDA
Semi-Supervised Learning for Large Label System Text Classification Model
- URL: http://arxiv.org/abs/2304.13140v1
- Date: Wed, 19 Apr 2023 03:44:23 GMT
- Title: ESimCSE Unsupervised Contrastive Learning Jointly with UDA
Semi-Supervised Learning for Large Label System Text Classification Model
- Authors: Ruan Lu, Zhou HangCheng, Ran Meng, Zhao Jin, Qin JiaoYu, Wei Feng,
Wang ChenZi
- Abstract summary: The ESimCSE model efficiently learns text vector representations using unlabeled data to achieve better classification results.
UDA is trained on unlabeled data through semi-supervised learning methods to improve the prediction performance and stability of the model.
The adversarial training techniques FGM and PGD are used in the model training process to improve the robustness and reliability of the model.
- Score: 4.708633772366381
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text classification with large label systems in natural language processing faces several challenges, including multiple label systems, uneven data distribution, and high noise. To address these problems, the ESimCSE unsupervised contrastive learning model and the UDA semi-supervised learning model are combined through joint training. The ESimCSE model efficiently learns text vector representations from unlabeled data to achieve better classification results, while UDA is trained on unlabeled data with semi-supervised learning methods to improve the prediction performance and stability of the model and to further improve its generalization ability. In addition, the adversarial training techniques FGM and PGD are used during training to improve the robustness and reliability of the model (illustrative sketches of these three components follow this abstract). Experimental results show accuracy improvements of 8% and 10% relative to the baseline on the public Reuters dataset and on an operational dataset, respectively, and a 15% improvement in manual validation accuracy on the operational dataset, indicating that the method is effective.
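The abstract names three trainable components; the sketches below are illustrative reconstructions in PyTorch, not the authors' code. First, ESimCSE-style unsupervised contrastive learning: a toy mean-pooling encoder stands in for the BERT-style encoder, positive pairs come from ESimCSE's word-repetition augmentation, and negatives are in-batch only (the momentum-queue negatives of the full method are omitted).

```python
# Sketch of ESimCSE-style contrastive learning. ToyEncoder is a
# stand-in for a pretrained sentence encoder; word_repetition builds
# the positive view; InfoNCE uses in-batch negatives.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):
    def __init__(self, vocab_size=1000, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)

    def forward(self, token_ids):               # (batch, seq_len)
        return self.emb(token_ids).mean(dim=1)  # mean-pool to (batch, dim)

def word_repetition(token_ids, dup_rate=0.2):
    """Positive augmentation: randomly duplicate tokens, then truncate
    back to the original length (duplication never shortens a row)."""
    seq_len = token_ids.size(1)
    out = []
    for row in token_ids.tolist():
        dup = [t for t in row for _ in range(2 if random.random() < dup_rate else 1)]
        out.append(dup[:seq_len])
    return torch.tensor(out)

def info_nce(z1, z2, temperature=0.05):
    """InfoNCE with in-batch negatives: row i of z1 matches row i of z2."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature   # (batch, batch) cosine similarities
    labels = torch.arange(z1.size(0))    # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

encoder = ToyEncoder()
batch = torch.randint(1, 1000, (8, 16))  # 8 unlabeled "sentences"
loss = info_nce(encoder(batch), encoder(word_repetition(batch)))
loss.backward()
```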
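Second, a hedged sketch of the UDA semi-supervised objective: supervised cross-entropy on a small labeled batch plus a consistency (KL) term that pulls predictions on augmented unlabeled text toward the model's predictions on the clean text. The toy model and the identity "augmentation" are placeholders; UDA proper uses back-translation or TF-IDF word replacement and adds confidence masking, omitted here.

```python
# Sketch of UDA-style consistency training for a text classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

def uda_loss(model, labeled_x, labels, unlabeled_x, augmented_x, lam=1.0):
    # Supervised term: plain cross-entropy on the labeled batch.
    sup = F.cross_entropy(model(labeled_x), labels)
    # Teacher distribution on clean unlabeled text (gradient stopped).
    with torch.no_grad():
        teacher = F.softmax(model(unlabeled_x), dim=-1)
    # Consistency term: KL between teacher and predictions on augmented text.
    student_log = F.log_softmax(model(augmented_x), dim=-1)
    unsup = F.kl_div(student_log, teacher, reduction="batchmean")
    return sup + lam * unsup

# Toy model and data, just to show the call shape.
model = nn.Sequential(nn.Embedding(1000, 64), nn.Flatten(1), nn.Linear(64 * 16, 10))
labeled_x = torch.randint(1, 1000, (4, 16))
labels = torch.randint(0, 10, (4,))
unlabeled_x = torch.randint(1, 1000, (32, 16))
augmented_x = unlabeled_x.clone()  # placeholder for back-translation etc.
uda_loss(model, labeled_x, labels, unlabeled_x, augmented_x).backward()
```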
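Third, the FGM adversarial-training wrapper in its common PyTorch form; PGD differs only in taking several smaller projected steps instead of one. The embedding parameter name `emb` is an assumption about the host model.

```python
# FGM: one L2-normalized gradient perturbation of the embedding weights,
# applied between the clean backward pass and the optimizer step.
import torch

class FGM:
    def __init__(self, model, emb_name="emb", epsilon=1.0):
        self.model, self.emb_name, self.epsilon = model, emb_name, epsilon
        self.backup = {}

    def attack(self):
        for name, p in self.model.named_parameters():
            if p.requires_grad and self.emb_name in name and p.grad is not None:
                self.backup[name] = p.data.clone()  # save clean weights
                norm = torch.norm(p.grad)
                if norm != 0:
                    p.data.add_(self.epsilon * p.grad / norm)  # step along gradient

    def restore(self):
        for name, p in self.model.named_parameters():
            if name in self.backup:
                p.data = self.backup[name]          # undo the perturbation
        self.backup = {}

# Usage inside one training step (model, optimizer, loss_fn assumed):
#   loss_fn(model(x), y).backward()   # gradients on clean input
#   fgm = FGM(model); fgm.attack()    # perturb embeddings adversarially
#   loss_fn(model(x), y).backward()   # adversarial gradients accumulate
#   fgm.restore(); optimizer.step(); optimizer.zero_grad()
```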
Related papers
- Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA).
Our method significantly outperforms existing approaches, achieving an average AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z)
- Data Augmentation for Sparse Multidimensional Learning Performance Data Using Generative AI [17.242331892899543]
Learning performance data describe correct and incorrect answers or problem-solving attempts in adaptive learning.
Learning performance data tend to be highly sparse (80%-90% missing observations) in most real-world applications due to adaptive item selection.
This article proposes a systematic framework for augmenting learner data to address data sparsity in learning performance data.
arXiv Detail & Related papers (2024-09-24T00:25:07Z)
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios.
We implement UAL in a simple fashion: adaptively setting the label smoothing value during training according to the uncertainty of individual samples (a minimal sketch of this idea appears after this list).
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
- Collaboration of Teachers for Semi-supervised Object Detection [20.991741476731967]
We propose the Collaboration of Teachers Framework (CTF), which consists of multiple pairs of teacher and student models for training.
This framework greatly improves the utilization of unlabeled data and prevents the positive feedback cycle of unreliable pseudo-labels.
arXiv Detail & Related papers (2024-05-22T06:17:50Z)
- Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Class-Aware Contrastive Semi-Supervised Learning [51.205844705156046]
We propose a general method named Class-aware Contrastive Semi-Supervised Learning (CCSSL) to improve pseudo-label quality and enhance the model's robustness in the real-world setting.
Our proposed CCSSL has significant performance improvements over the state-of-the-art SSL methods on the standard datasets CIFAR100 and STL10.
arXiv Detail & Related papers (2022-03-04T12:18:23Z)
- Self Training with Ensemble of Teacher Models [8.257085583227695]
In order to train robust deep learning models, large amounts of labelled data are required.
In the absence of such large repositories of labelled data, unlabeled data can be exploited for the same purpose.
Semi-Supervised learning aims to utilize such unlabeled data for training classification models.
arXiv Detail & Related papers (2021-07-17T09:44:09Z)
- Ensemble Learning-Based Approach for Improving Generalization Capability of Machine Reading Comprehension Systems [0.7614628596146599]
Machine Reading Comprehension (MRC) is an active field in natural language processing with many successfully developed models in recent years.
Despite their high in-distribution accuracy, these models suffer from two issues: high training cost and low out-of-distribution accuracy.
In this paper, we investigate the effect of ensemble learning approach to improve generalization of MRC systems without retraining a big model.
arXiv Detail & Related papers (2021-07-01T11:11:17Z)
- DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z)
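The Uncertainty Aware Learning entry above describes adaptively setting the per-sample label smoothing value from that sample's uncertainty. A minimal sketch of the idea, using normalized predictive entropy as the uncertainty score (an assumption; the paper defines its own measure):

```python
# Sketch of uncertainty-adaptive label smoothing: higher-entropy samples
# get a larger smoothing value, capped at max_smooth.
import math
import torch
import torch.nn.functional as F

def adaptive_smoothing_loss(logits, targets, max_smooth=0.2):
    num_classes = logits.size(-1)
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)
    eps = max_smooth * (entropy / math.log(num_classes)).detach()  # per-sample
    one_hot = F.one_hot(targets, num_classes).float()
    soft = (1 - eps).unsqueeze(-1) * one_hot + (eps / num_classes).unsqueeze(-1)
    return -(soft * F.log_softmax(logits, dim=-1)).sum(-1).mean()

loss = adaptive_smoothing_loss(torch.randn(8, 5), torch.randint(0, 5, (8,)))
```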