Improving Imbalanced Text Classification with Dynamic Curriculum
Learning
- URL: http://arxiv.org/abs/2210.14724v1
- Date: Tue, 25 Oct 2022 07:57:59 GMT
- Title: Improving Imbalanced Text Classification with Dynamic Curriculum
Learning
- Authors: Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
- Abstract summary: We propose a novel self-paced dynamic curriculum learning method for imbalanced text classification.
Our SPDCL reorders and resamples training data by a difficulty criterion with an adaptive easy-to-hard pace.
Experiments on several classification tasks show the effectiveness of the SPDCL strategy, especially on imbalanced datasets.
- Score: 32.731900584216724
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in pre-trained language models have improved performance
on text classification tasks. However, little attention has been paid to the
strategy for prioritizing samples during training. Humans acquire knowledge
gradually, from easy to complex concepts, and the difficulty of the same
material can also vary significantly across learning stages. Inspired by these
insights, we propose a novel self-paced dynamic curriculum learning (SPDCL)
method for imbalanced text classification, which evaluates sample difficulty by
both linguistic character and model capacity. Meanwhile, rather than using
static curriculum learning as in existing research, our SPDCL reorders and
resamples training data by a difficulty criterion with an adaptive easy-to-hard
pace. Extensive experiments on several classification tasks show the
effectiveness of the SPDCL strategy, especially on imbalanced datasets.
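For a concrete picture of the idea described above, the sketch below shows one way a self-paced, easy-to-hard curriculum sampler can be implemented. It is a minimal illustration in the spirit of SPDCL, not the authors' exact method: the difficulty score (a length-based linguistic proxy blended with the current per-sample loss as a model-capacity proxy), the linear pacing function, and all function names are assumptions made for demonstration.

```python
# Minimal sketch of a self-paced, easy-to-hard curriculum sampler (assumptions,
# not the SPDCL paper's exact formulation).
import random
from typing import List, Sequence


def difficulty_scores(texts: Sequence[str], losses: Sequence[float],
                      alpha: float = 0.5) -> List[float]:
    """Blend a linguistic proxy (normalized token length) with a model-based
    proxy (normalized per-sample loss). Both proxies are illustrative."""
    max_len = max(len(t.split()) for t in texts) or 1
    max_loss = max(losses) or 1.0
    return [alpha * (len(t.split()) / max_len) + (1.0 - alpha) * (l / max_loss)
            for t, l in zip(texts, losses)]


def pacing_fraction(step: int, total_steps: int, start: float = 0.2) -> float:
    """Linear pacing: begin with the easiest `start` fraction, end with all data."""
    return min(1.0, start + (1.0 - start) * step / max(1, total_steps))


def curriculum_batch(texts: Sequence[str], losses: Sequence[float],
                     step: int, total_steps: int, batch_size: int) -> List[int]:
    """Resample one batch from the currently visible easy-to-hard prefix of
    the difficulty-sorted data."""
    scores = difficulty_scores(texts, losses)
    order = sorted(range(len(texts)), key=lambda i: scores[i])  # easy -> hard
    visible = min(len(order),
                  max(batch_size, int(pacing_fraction(step, total_steps) * len(order))))
    return random.sample(order[:visible], min(batch_size, visible))


if __name__ == "__main__":
    texts = ["short text", "a somewhat longer example sentence here",
             "tiny", "an even longer and presumably harder training example sentence"]
    losses = [0.3, 1.2, 0.1, 2.0]  # stand-in per-sample losses from the current model
    for step in (0, 5, 10):
        print(step, curriculum_batch(texts, losses, step, total_steps=10, batch_size=2))
```

In this sketch the sampled indices are drawn only from the easy prefix early in training and from the full dataset by the end; in practice the difficulty scores would be recomputed from the evolving model so that the ordering adapts across learning stages.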
Related papers
- How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics [49.9329723199239]
We propose a method for the automated creation of a challenging test set without relying on the manual construction of artificial and unrealistic examples.
We categorize the test set of popular NLI datasets into three difficulty levels by leveraging methods that exploit training dynamics.
When our characterization method is applied to the training set, models trained with only a fraction of the data achieve comparable performance to those trained on the full dataset.
arXiv Detail & Related papers (2024-10-04T13:39:21Z)
- A Psychology-based Unified Dynamic Framework for Curriculum Learning [5.410910735259908]
This paper presents a Psychology-based Unified Dynamic Framework for Curriculum Learning (PUDF)
We quantify the difficulty of training data by applying Item Response Theory (IRT) to responses from Artificial Crowds (AC)
We propose a Dynamic Data Selection via Model Ability Estimation (DDS-MAE) strategy to schedule the appropriate amount of data during model training.
arXiv Detail & Related papers (2024-08-09T20:30:37Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis [87.75833205560406]
This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system.
It does not require pooling data from all languages, and thus alleviates the storage and computation burden.
arXiv Detail & Related papers (2021-10-09T07:00:38Z)
- Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models [51.744357472072416]
We propose a method, which continually identifies the weak spots of a model to generate more valuable training instances.
Experimental results show that such an adversarial training method combined with the pretraining strategy can improve both the generalization and robustness of multiple CSC models.
arXiv Detail & Related papers (2021-05-31T09:17:33Z)
- Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates [52.164757178369804]
Recent advances in transfer learning for natural language processing in conjunction with active learning open the possibility to significantly reduce the necessary annotation budget.
We conduct an empirical study of various Bayesian uncertainty estimation methods and Monte Carlo dropout options for deep pre-trained models in the active learning framework.
We also demonstrate that to acquire instances during active learning, a full-size Transformer can be substituted with a distilled version, which yields better computational performance.
arXiv Detail & Related papers (2021-01-20T13:59:25Z)
- Unsupervised neural adaptation model based on optimal transport for spoken language identification [54.96267179988487]
Due to the mismatch of statistical distributions of acoustic speech between training and testing sets, the performance of spoken language identification (SLID) could be drastically degraded.
We propose an unsupervised neural adaptation model to deal with the distribution mismatch problem for SLID.
arXiv Detail & Related papers (2020-12-24T07:37:19Z)
- Curriculum Learning with Diversity for Supervised Computer Vision Tasks [1.5229257192293197]
We introduce a novel curriculum sampling strategy which takes into consideration the diversity of the training data together with the difficulty of the inputs.
We prove that our strategy is very efficient for unbalanced data sets, leading to faster convergence and more accurate results.
arXiv Detail & Related papers (2020-09-22T15:32:49Z)