Density-Based Dynamic Curriculum Learning for Intent Detection
- URL: http://arxiv.org/abs/2108.10674v1
- Date: Tue, 24 Aug 2021 12:29:26 GMT
- Title: Density-Based Dynamic Curriculum Learning for Intent Detection
- Authors: Yantao Gong, Cao Liu, Jiazhen Yuan, Fan Yang, Xunliang Cai, Guanglu
Wan, Jiansong Chen, Ruiyao Niu and Houfeng Wang
- Abstract summary: Our model defines each sample's difficulty level according to the density of its eigenvector.
We apply a dynamic curriculum learning strategy, which pays distinct attention to samples of various difficulty levels.
Experiments on three open datasets verify that the proposed density-based algorithm can effectively distinguish simple samples from complex ones.
- Score: 14.653917644725427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models have achieved noticeable performance on
the intent detection task. However, because they assign an identical weight to
every sample, they tend to overfit simple samples and fail to learn complex
samples well. To handle this problem, we propose a density-based dynamic
curriculum learning model. Our model defines each sample's difficulty level
according to the density of its eigenvector, thereby exploiting the overall
distribution of all samples' eigenvectors simultaneously. We then apply a
dynamic curriculum learning strategy that pays distinct attention to samples of
various difficulty levels and alters the proportion of samples during the
training process. In this way, simple samples are well trained and complex
samples are enhanced. Experiments on three open datasets verify that the
proposed density-based algorithm can effectively distinguish simple samples
from complex ones. Moreover, our model obtains a clear improvement over strong
baselines.
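The abstract only sketches the method, so here is a minimal illustrative sketch of how a density-based difficulty score and a dynamic curriculum schedule could fit together. It assumes the "eigenvectors" are the encoder's per-sample feature vectors, estimates density with a k-nearest-neighbor heuristic, and uses a linear pacing function; all three are assumptions for illustration, not the authors' exact formulation.

```python
# A minimal sketch, assuming k-NN density over feature vectors and a
# linear pacing schedule (illustrative, not the paper's exact method).
import numpy as np

def density_difficulty(features: np.ndarray, k: int = 10) -> np.ndarray:
    """Low local density -> high difficulty, normalized to [0, 1]."""
    # Pairwise Euclidean distances between all feature vectors.
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    # Mean distance to the k nearest neighbors (index 0 is the sample itself).
    knn = np.sort(dists, axis=1)[:, 1:k + 1].mean(axis=1)
    return (knn - knn.min()) / (knn.max() - knn.min() + 1e-12)

def curriculum_subset(difficulty: np.ndarray, epoch: int,
                      total_epochs: int) -> np.ndarray:
    """Start with the easiest 30% and linearly admit harder samples."""
    frac = min(1.0, 0.3 + 0.7 * epoch / max(1, total_epochs - 1))
    order = np.argsort(difficulty)                 # easy -> hard
    return order[: max(1, int(frac * len(order)))]

# Toy usage with random 32-dim "feature vectors" for 200 samples.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 32))
diff = density_difficulty(feats)
for epoch in range(5):
    idx = curriculum_subset(diff, epoch, total_epochs=5)
    print(f"epoch {epoch}: training on {len(idx)} samples")
```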
Related papers
- Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization [165.98557106089777]
A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data.
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets.
We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
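As a loose illustration of what instance-reweighted distributionally robust optimization can look like: with a KL-divergence regularizer, the worst-case instance weights take a softmax-of-losses form. The temperature tau below is a hypothetical hyperparameter; the paper's actual framework may differ.

```python
# Illustrative only: KL-regularized DRO upweights hard (high-loss)
# samples via softmax weights w_i ∝ exp(loss_i / tau).
import numpy as np

def dro_instance_weights(losses: np.ndarray, tau: float = 1.0) -> np.ndarray:
    z = (losses - losses.max()) / tau     # max-shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

losses = np.array([0.2, 0.5, 2.0, 0.1])
weights = dro_instance_weights(losses)
reweighted_loss = float((weights * losses).sum())  # emphasizes hard samples
```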
arXiv Detail & Related papers (2024-02-22T04:10:57Z)
- Data Pruning via Moving-one-Sample-out [61.45441981346064]
We propose a novel data-pruning approach called moving-one-sample-out (MoSo), which aims to identify and remove the least informative samples from the training set.
Experimental results demonstrate that MoSo effectively mitigates severe performance degradation at high pruning ratios.
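A hedged sketch of the leave-one-out intuition behind MoSo: score each sample by how well its gradient agrees with the average gradient of the remaining samples, and prune the lowest scorers. The gradient-dot-product surrogate is an assumption for illustration, not the paper's exact estimator.

```python
# Illustrative surrogate: a sample is informative if its gradient aligns
# with the mean gradient of the remaining samples.
import numpy as np

def moso_scores(per_sample_grads: np.ndarray) -> np.ndarray:
    """per_sample_grads: (n, d) array of per-sample loss gradients."""
    n = per_sample_grads.shape[0]
    total = per_sample_grads.sum(axis=0)
    scores = np.empty(n)
    for i in range(n):
        others_mean = (total - per_sample_grads[i]) / (n - 1)
        scores[i] = per_sample_grads[i] @ others_mean
    return scores

# Keep the top 80% of samples by score; prune the least informative 20%.
grads = np.random.default_rng(1).normal(size=(100, 16))
keep = np.argsort(moso_scores(grads))[::-1][:80]
```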
arXiv Detail & Related papers (2023-10-23T08:00:03Z)
- Hard Sample Aware Network for Contrastive Deep Graph Clustering [38.44763843990694]
We propose a novel contrastive deep graph clustering method dubbed Hard Sample Aware Network (HSAN).
In our algorithm, the similarities between samples are calculated by considering both the attribute embeddings and the structure embeddings.
Under the guidance of the carefully collected high-confidence clustering information, our proposed weight modulating function will first recognize the positive and negative samples.
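A minimal sketch of the similarity computation the summary describes, fusing attribute and structure embeddings; cosine similarity and the fixed mixing weight alpha are illustrative assumptions.

```python
# Sketch of a fused pairwise similarity over two embedding views.
import numpy as np

def fused_similarity(attr_emb: np.ndarray, struct_emb: np.ndarray,
                     alpha: float = 0.5) -> np.ndarray:
    def cosine(x: np.ndarray) -> np.ndarray:
        x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-12)
        return x @ x.T
    return alpha * cosine(attr_emb) + (1.0 - alpha) * cosine(struct_emb)
```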
arXiv Detail & Related papers (2022-12-16T16:57:37Z)
- DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination [28.599571524763785]
Given data with label noise (i.e., incorrectly labeled data), deep neural networks gradually memorize the noisy labels, which impairs model performance.
To alleviate this issue, curriculum learning has been proposed to improve model performance and generalization by ordering training samples in a meaningful sequence.
arXiv Detail & Related papers (2022-08-21T13:38:55Z)
- Reweighted Manifold Learning of Collective Variables from Enhanced Sampling Simulations [2.6009298669020477]
We provide a framework based on anisotropic diffusion maps for manifold learning.
We show that our framework reverts the biasing effect, yielding collective variables (CVs) that correctly describe the equilibrium density.
We show that it can be used in many manifold learning techniques on data from both standard and enhanced sampling simulations.
arXiv Detail & Related papers (2022-07-29T08:59:56Z)
- Style Curriculum Learning for Robust Medical Image Segmentation [62.02435329931057]
Deep segmentation models often degrade due to distribution shifts in image intensities between the training and test data sets.
We propose a novel framework to ensure robust segmentation in the presence of such distribution shifts.
arXiv Detail & Related papers (2021-08-01T08:56:24Z)
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
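A rough sketch of the two-view consistency idea in this summary: agreement between a sample's predictions under two augmented views serves as a proxy for its likelihood of being clean. The Jensen-Shannon divergence here is an illustrative choice, not necessarily the paper's exact measure.

```python
# Sketch: two-view prediction agreement as a clean-label proxy.
import numpy as np

def clean_likelihood(p1: np.ndarray, p2: np.ndarray) -> np.ndarray:
    """p1, p2: (n, c) softmax outputs from two augmented views."""
    m = 0.5 * (p1 + p2)
    def kl(a, b):
        return np.sum(a * (np.log(a + 1e-12) - np.log(b + 1e-12)), axis=1)
    js = 0.5 * kl(p1, m) + 0.5 * kl(p2, m)   # in [0, log 2]
    return 1.0 - js / np.log(2)              # high agreement -> likely clean
```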
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
- One for More: Selecting Generalizable Samples for Generalizable ReID Model [92.40951770273972]
This paper proposes a one-for-more training objective that takes the generalization ability of selected samples as a loss function.
Our proposed one-for-more based sampler can be seamlessly integrated into the ReID training framework.
arXiv Detail & Related papers (2020-12-10T06:37:09Z)
- Optimal Importance Sampling for Federated Learning [57.14673504239551]
Federated learning involves a mixture of centralized and decentralized processing tasks.
The sampling of both agents and data is generally uniform; however, in this work we consider non-uniform sampling.
We derive optimal importance sampling strategies for both agent and data selection and show that non-uniform sampling without replacement improves the performance of the original FedAvg algorithm.
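To make the non-uniform sampling idea concrete, here is a hedged sketch of importance-sampled client selection with inverse-probability weighting to keep the server aggregate unbiased. Sampling with replacement, probabilities proportional to local gradient norms, and the hypothetical local_update call are simplifying assumptions; the paper derives optimal strategies and studies sampling without replacement.

```python
# Sketch: importance-sampled clients with inverse-probability weights.
import numpy as np

rng = np.random.default_rng(2)
n_clients, m = 20, 5
grad_norms = rng.uniform(0.1, 2.0, size=n_clients)   # stand-in importance
p = grad_norms / grad_norms.sum()                    # sampling distribution

chosen = rng.choice(n_clients, size=m, replace=True, p=p)
weights = 1.0 / (m * p[chosen])   # E[sum_j w_j * u_{c_j}] = sum_i u_i
# server_delta = sum(w * local_update(c) for w, c in zip(weights, chosen))
```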
arXiv Detail & Related papers (2020-10-26T14:15:33Z)
- Minority Class Oversampling for Tabular Data with Deep Generative Models [4.976007156860967]
We study the ability of deep generative models to provide realistic samples that improve performance on imbalanced classification tasks via oversampling.
Our experiments show that the sampling method does not affect sample quality, but runtimes vary widely.
We also observe that the improvements in the performance metrics, while statistically significant, are often minor in absolute terms.
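A hedged sketch of generative oversampling: fit a generative model to the minority-class features and draw synthetic samples until the classes balance. A single multivariate Gaussian below is a stand-in for the deep generative models the paper actually studies.

```python
# Stand-in generator: one Gaussian fitted to the minority class.
import numpy as np

def gaussian_oversample(minority: np.ndarray, n_new: int,
                        seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    mu = minority.mean(axis=0)
    cov = np.cov(minority, rowvar=False) + 1e-6 * np.eye(minority.shape[1])
    return rng.multivariate_normal(mu, cov, size=n_new)

# Balance a 90/10 split by generating 80 synthetic minority samples.
minority = np.random.default_rng(3).normal(size=(10, 4))
synthetic = gaussian_oversample(minority, n_new=80)
```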
arXiv Detail & Related papers (2020-05-07T21:35:57Z)
- Efficient Deep Representation Learning by Adaptive Latent Space Sampling [16.320898678521843]
Supervised deep learning requires a large amount of training samples with annotations, which are expensive and time-consuming to obtain.
We propose a novel training framework which adaptively selects informative samples that are fed to the training process.
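A rough sketch of adaptive informative-sample selection, using prediction entropy as a stand-in informativeness score; the paper selects in a learned latent space, so the entropy criterion here is purely an assumption.

```python
# Illustrative criterion: prediction entropy as informativeness.
import numpy as np

def entropy_select(probs: np.ndarray, n_select: int) -> np.ndarray:
    """probs: (n, c) softmax predictions; returns most uncertain indices."""
    ent = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(ent)[::-1][:n_select]
```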
arXiv Detail & Related papers (2020-03-19T22:17:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.