Density-Based Dynamic Curriculum Learning for Intent Detection
- URL: http://arxiv.org/abs/2108.10674v1
- Date: Tue, 24 Aug 2021 12:29:26 GMT
- Title: Density-Based Dynamic Curriculum Learning for Intent Detection
- Authors: Yantao Gong, Cao Liu, Jiazhen Yuan, Fan Yang, Xunliang Cai, Guanglu
Wan, Jiansong Chen, Ruiyao Niu and Houfeng Wang
- Abstract summary: Our model defines each sample's difficulty level according to the density of its eigenvector.
We apply a dynamic curriculum learning strategy, which pays distinct attention to samples of various difficulty levels.
Experiments on three open datasets verify that the proposed density-based algorithm effectively distinguishes simple samples from complex ones.
- Score: 14.653917644725427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models have achieved noticeable performance on the
intent detection task. However, due to assigning an identical weight to each
sample, they suffer from the overfitting of simple samples and the failure to
learn complex samples well. To handle this problem, we propose a density-based
dynamic curriculum learning model. Our model defines each sample's difficulty
level according to the density of its eigenvector. In this way, we exploit the
overall distribution of all samples' eigenvectors simultaneously. Then we apply
a dynamic curriculum learning strategy, which pays distinct attention to
samples of various difficulty levels and alters the proportion of samples
during the training process. Through the above operation, simple samples are
well-trained, and complex samples are enhanced. Experiments on three open
datasets verify that the proposed density-based algorithm effectively
distinguishes simple samples from complex ones. In addition, our model obtains
clear improvements over strong baselines.
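The abstract does not pin down a concrete density estimator or curriculum schedule, so the sketch below is only an illustration of the overall idea: random vectors stand in for the encoder's feature vectors, a k-nearest-neighbor density serves as the difficulty signal (dense regions treated as simple, sparse regions as complex), and a linear schedule shifts the sampling proportion from simple to complex samples over training. All names, constants, and design choices here are assumptions, not the authors' exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the samples' feature vectors (the paper derives these from the encoder).
embeddings = rng.normal(size=(500, 64))
labels = rng.integers(0, 5, size=500)  # e.g., 5 intent classes

def knn_density(x, k=10):
    """Density per sample: inverse of the mean distance to its k nearest neighbors."""
    sq = (x ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    dists = np.sqrt(d2)
    np.fill_diagonal(dists, np.inf)          # ignore self-distance
    knn = np.sort(dists, axis=1)[:, :k]
    return 1.0 / (knn.mean(axis=1) + 1e-8)

density = knn_density(embeddings)
# Dense regions are treated as "simple", sparse regions as "complex".
difficulty = 1.0 - (density - density.min()) / (density.max() - density.min() + 1e-8)

def curriculum_weights(difficulty, progress):
    """Shift sampling emphasis from simple to complex samples as progress goes 0 -> 1."""
    w = (1.0 - difficulty) * (1.0 - progress) + difficulty * progress
    return w / w.sum()

n_epochs, batch_size = 5, 32
for epoch in range(n_epochs):
    p = curriculum_weights(difficulty, progress=epoch / (n_epochs - 1))
    batch = rng.choice(len(embeddings), size=batch_size, replace=False, p=p)
    # ... a real pipeline would feed embeddings[batch], labels[batch] to the classifier ...
    print(f"epoch {epoch}: mean sampled difficulty = {difficulty[batch].mean():.3f}")
```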
Related papers
- Preview-based Category Contrastive Learning for Knowledge Distillation [53.551002781828146]
We propose a novel preview-based category contrastive learning method for knowledge distillation (PCKD)
It first distills the structural knowledge of both instance-level feature correspondence and the relation between instance features and category centers.
It can explicitly optimize the category representation and explore the distinct correlation between representations of instances and categories.
arXiv Detail & Related papers (2024-10-18T03:31:00Z) - Detection of Under-represented Samples Using Dynamic Batch Training for Brain Tumor Segmentation from MR Images [0.8437187555622164]
Segmenting brain tumors in magnetic resonance (MR) images is difficult, time-consuming, and prone to human error.
These challenges can be resolved by developing automatic brain tumor segmentation methods from MR images.
Various deep-learning models based on the U-Net have been proposed for the task.
These deep-learning models are trained on a dataset of tumor images and then used to predict segmentation masks.
arXiv Detail & Related papers (2024-08-21T21:51:47Z) - Data Pruning via Moving-one-Sample-out [61.45441981346064]
We propose a novel data-pruning approach called moving-one-sample-out (MoSo)
MoSo aims to identify and remove the least informative samples from the training set.
Experimental results demonstrate that MoSo effectively mitigates severe performance degradation at high pruning ratios.
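The snippet above does not spell out how informativeness is measured; as a rough illustration of the moving-one-sample-out idea (not MoSo's actual estimator, which the paper approximates far more efficiently), one can score a training sample by how much a held-out loss changes when that sample is left out of fitting a small model. Everything below is a toy, hypothetical setup.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
X_tr, y_tr, X_val, y_val = X[:150], y[:150], X[150:], y[150:]

def fit_ridge(X, y, lam=1e-2):
    """Closed-form ridge regression weights."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def val_loss(w):
    return float(np.mean((X_val @ w - y_val) ** 2))

base_loss = val_loss(fit_ridge(X_tr, y_tr))

# Score each training sample by the change in held-out loss when it is left out.
scores = np.empty(len(X_tr))
for i in range(len(X_tr)):
    keep = np.arange(len(X_tr)) != i
    scores[i] = val_loss(fit_ridge(X_tr[keep], y_tr[keep])) - base_loss

# Samples whose removal barely changes (or even reduces) the loss are the least
# informative and would be the first candidates for pruning.
least_informative = np.argsort(scores)[:10]
print(least_informative)
```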
arXiv Detail & Related papers (2023-10-23T08:00:03Z) - Hard Sample Aware Network for Contrastive Deep Graph Clustering [38.44763843990694]
We propose a novel contrastive deep graph clustering method dubbed Hard Sample Aware Network (HSAN)
In our algorithm, the similarities between samples are calculated by considering both the attribute embeddings and the structure embeddings.
Under the guidance of the carefully collected high-confidence clustering information, our proposed weight modulating function will first recognize the positive and negative samples.
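As a rough illustration of combining attribute-view and structure-view similarities, the sketch below blends two cosine-similarity matrices with a fixed weight; the mixing weight, names, and random inputs are chosen here purely for illustration and are not taken from HSAN.

```python
import numpy as np

def cosine_sim(x):
    """Row-normalized cosine similarity matrix."""
    x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    return x @ x.T

def combined_similarity(attr_emb, struct_emb, alpha=0.5):
    """Blend attribute-view and structure-view similarities (alpha is illustrative)."""
    return alpha * cosine_sim(attr_emb) + (1.0 - alpha) * cosine_sim(struct_emb)

rng = np.random.default_rng(2)
attr_emb = rng.normal(size=(50, 16))    # e.g., embeddings from node attributes
struct_emb = rng.normal(size=(50, 16))  # e.g., embeddings from graph structure
S = combined_similarity(attr_emb, struct_emb)
print(S.shape)  # (50, 50) pairwise similarity matrix
```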
arXiv Detail & Related papers (2022-12-16T16:57:37Z) - DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples
Discrimination [28.599571524763785]
Given data with label noise (i.e., incorrectly labeled samples), deep neural networks gradually memorize the noisy labels, which impairs model performance.
To relieve this issue, curriculum learning is proposed to improve model performance and generalization by ordering training samples in a meaningful sequence.
arXiv Detail & Related papers (2022-08-21T13:38:55Z) - Style Curriculum Learning for Robust Medical Image Segmentation [62.02435329931057]
Deep segmentation models often degrade due to distribution shifts in image intensities between the training and test data sets.
We propose a novel framework to ensure robust segmentation in the presence of such distribution shifts.
arXiv Detail & Related papers (2021-08-01T08:56:24Z) - Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency)
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
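A minimal sketch of such two-view consistency scoring, using the Jensen-Shannon divergence between the two softmax outputs as one plausible consistency measure (an assumption here, not necessarily Jo-SRC's exact criterion):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two categorical distributions (row-wise)."""
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log((a + eps) / (b + eps)), axis=1)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

rng = np.random.default_rng(3)
logits_view1 = rng.normal(size=(8, 10))                         # predictions on augmentation 1
logits_view2 = logits_view1 + 0.3 * rng.normal(size=(8, 10))    # predictions on augmentation 2

divergence = js_divergence(softmax(logits_view1), softmax(logits_view2))
# Low divergence between the two views -> the sample is more likely "clean";
# high divergence -> the label may be noisy or the sample out-of-distribution.
likely_clean = divergence < np.median(divergence)
print(likely_clean)
```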
arXiv Detail & Related papers (2021-03-24T07:26:07Z) - One for More: Selecting Generalizable Samples for Generalizable ReID
Model [92.40951770273972]
This paper proposes a one-for-more training objective that takes the generalization ability of selected samples as a loss function.
Our proposed one-for-more based sampler can be seamlessly integrated into the ReID training framework.
arXiv Detail & Related papers (2020-12-10T06:37:09Z) - Optimal Importance Sampling for Federated Learning [57.14673504239551]
Federated learning involves a mixture of centralized and decentralized processing tasks.
The sampling of both agents and data is generally uniform; however, in this work we consider non-uniform sampling.
We derive optimal importance sampling strategies for both agent and data selection and show that non-uniform sampling without replacement improves the performance of the original FedAvg algorithm.
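The optimal sampling distributions themselves are derived in the paper; the sketch below only illustrates the generic importance-sampling recipe (sample agents with probabilities p_i and reweight their updates by 1/(N p_i) so the aggregate stays unbiased). It uses sampling with replacement for simplicity, whereas the paper specifically analyzes without-replacement schemes, and the importance scores are placeholders.

```python
import numpy as np

rng = np.random.default_rng(4)
n_agents = 20
# Illustrative proxy for each agent's importance (e.g., expected update magnitude).
importance = rng.uniform(0.1, 1.0, size=n_agents)
p = importance / importance.sum()               # non-uniform sampling distribution

local_updates = rng.normal(size=(n_agents, 8))  # pretend local model updates

m = 5                                           # agents sampled per round
chosen = rng.choice(n_agents, size=m, replace=True, p=p)

# Reweight each sampled update by 1 / (n_agents * p_i) so that, in expectation,
# the aggregate equals the plain average over all agents (unbiasedness).
weights = 1.0 / (n_agents * p[chosen])
aggregate = (weights[:, None] * local_updates[chosen]).mean(axis=0)
print(aggregate)
```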
arXiv Detail & Related papers (2020-10-26T14:15:33Z) - Minority Class Oversampling for Tabular Data with Deep Generative Models [4.976007156860967]
We study the ability of deep generative models to provide realistic samples that improve performance on imbalanced classification tasks via oversampling.
Our experiments show that the sampling method does not affect quality, but runtime varies widely.
We also observe that the improvements in performance metrics, while statistically significant, are often minor in absolute terms.
arXiv Detail & Related papers (2020-05-07T21:35:57Z) - Efficient Deep Representation Learning by Adaptive Latent Space Sampling [16.320898678521843]
Supervised deep learning requires a large number of annotated training samples, which are expensive and time-consuming to obtain.
We propose a novel training framework which adaptively selects informative samples that are fed to the training process.
arXiv Detail & Related papers (2020-03-19T22:17:02Z)