Improving Imbalanced Text Classification with Dynamic Curriculum Learning
- URL: http://arxiv.org/abs/2210.14724v1
- Date: Tue, 25 Oct 2022 07:57:59 GMT
- Title: Improving Imbalanced Text Classification with Dynamic Curriculum Learning
- Authors: Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
- Abstract summary: We propose a novel self-paced dynamic curriculum learning method for imbalanced text classification.
Our SPDCL reorders and resamples training data by a difficulty criterion at an adaptive easy-to-hard pace.
The experiments on several classification tasks show the effectiveness of the SPDCL strategy, especially on imbalanced datasets.
- Score: 32.731900584216724
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in pre-trained language models have improved the performance
for text classification tasks. However, little attention is paid to the
priority scheduling strategy on the samples during training. Humans acquire
knowledge gradually from easy to complex concepts, and the difficulty of the
same material can also vary significantly in different learning stages.
Inspired by these insights, we propose a novel self-paced dynamic curriculum
learning (SPDCL) method for imbalanced text classification, which evaluates
sample difficulty by both linguistic characteristics and model capacity.
Meanwhile, rather than using static curriculum learning as in existing
research, our SPDCL can reorder and resample training data by a difficulty
criterion at an adaptive easy-to-hard pace. Extensive experiments on several
classification tasks show the effectiveness of the SPDCL strategy, especially
on imbalanced datasets.
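The easy-to-hard pacing described in the abstract can be illustrated with a minimal sketch. This is not the paper's actual schedule: the function name, the linear pacing rule, and the 10% admission floor are illustrative assumptions.

```python
import numpy as np

def curriculum_schedule(difficulties, epoch, total_epochs, rng=None):
    """Select an easy-to-hard training subset for the given epoch.

    `difficulties` holds per-sample difficulty scores (higher = harder).
    The fraction of data admitted grows linearly with training progress,
    so early epochs see only the easiest samples.
    """
    rng = rng or np.random.default_rng(0)
    n = len(difficulties)
    # Fraction of the dataset admitted at this epoch (at least 10%).
    frac = max(0.1, (epoch + 1) / total_epochs)
    k = max(1, int(frac * n))
    order = np.argsort(difficulties)  # easiest samples first
    admitted = order[:k]
    rng.shuffle(admitted)             # shuffle within the admitted pool
    return admitted
```

A dynamic variant in the spirit of SPDCL would recompute `difficulties` from the current model's losses each epoch instead of fixing them up front.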
Related papers
- One-Shot Learning as Instruction Data Prospector for Large Language Models [108.81681547472138]
Nuggets uses one-shot learning to select high-quality instruction data from extensive datasets.
We show that instruction tuning with the top 1% of examples curated by Nuggets substantially outperforms conventional methods employing the entire dataset.
arXiv Detail & Related papers (2023-12-16T03:33:12Z)
- Influence Scores at Scale for Efficient Language Data Sampling [3.072340427031969]
"Influence scores" are used to identify important subsets of data.
In this paper, we explore the applicability of influence scores in language classification tasks.
arXiv Detail & Related papers (2023-11-27T20:19:22Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
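Sample re-weighting of the kind this summary describes can be illustrated with a simple hand-crafted, class-level baseline. The "effective number of samples" heuristic below is a common static scheme, not CMW-Net's learned meta-model; the function name and the `beta` default are assumptions for illustration.

```python
import numpy as np

def class_balanced_weights(labels, beta=0.999):
    """Per-sample weights from the 'effective number of samples' heuristic.

    Rare classes receive larger weights, counteracting class imbalance.
    This is a static baseline; CMW-Net instead learns the weighting
    scheme adaptively from data via a meta-model.
    """
    classes, counts = np.unique(labels, return_counts=True)
    # Effective number of samples per class: (1 - beta^n) / (1 - beta).
    effective = (1.0 - np.power(beta, counts)) / (1.0 - beta)
    w = 1.0 / effective
    w = w / w.sum() * len(classes)  # normalize to mean 1 over classes
    class_to_w = dict(zip(classes.tolist(), w.tolist()))
    return np.array([class_to_w[int(y)] for y in labels])
```

The resulting per-sample weights would typically multiply the per-sample loss terms before averaging.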
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis [87.75833205560406]
This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system.
It does not require pooled data from all languages altogether, and thus alleviates the storage and computation burden.
arXiv Detail & Related papers (2021-10-09T07:00:38Z)
- Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models [51.744357472072416]
We propose a method, which continually identifies the weak spots of a model to generate more valuable training instances.
Experimental results show that such an adversarial training method combined with the pretraining strategy can improve both the generalization and robustness of multiple CSC models.
arXiv Detail & Related papers (2021-05-31T09:17:33Z)
- Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates [52.164757178369804]
Recent advances in transfer learning for natural language processing in conjunction with active learning open the possibility to significantly reduce the necessary annotation budget.
We conduct an empirical study of various Bayesian uncertainty estimation methods and Monte Carlo dropout options for deep pre-trained models in the active learning framework.
We also demonstrate that to acquire instances during active learning, a full-size Transformer can be substituted with a distilled version, which yields better computational performance.
arXiv Detail & Related papers (2021-01-20T13:59:25Z)
- Unsupervised neural adaptation model based on optimal transport for spoken language identification [54.96267179988487]
Due to the mismatch of statistical distributions of acoustic speech between training and testing sets, the performance of spoken language identification (SLID) could be drastically degraded.
We propose an unsupervised neural adaptation model to deal with the distribution mismatch problem for SLID.
arXiv Detail & Related papers (2020-12-24T07:37:19Z)
- Dynamic Data Selection for Curriculum Learning via Ability Estimation [6.255759848576057]
We propose replacing difficulty heuristics with learned difficulty parameters.
We also propose Dynamic Data Selection for Curriculum Learning via Ability Estimation.
We show that models using learned difficulty and/or ability outperform data-based curriculum learning models on the GLUE classification tasks.
arXiv Detail & Related papers (2020-10-30T20:01:56Z)
- Curriculum Learning with Diversity for Supervised Computer Vision Tasks [1.5229257192293197]
We introduce a novel curriculum sampling strategy which takes into consideration the diversity of the training data together with the difficulty of the inputs.
We prove that our strategy is very efficient for unbalanced data sets, leading to faster convergence and more accurate results.
arXiv Detail & Related papers (2020-09-22T15:32:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information (including all content) and is not responsible for any consequences.