Density-Based Dynamic Curriculum Learning for Intent Detection
- URL: http://arxiv.org/abs/2108.10674v1
- Date: Tue, 24 Aug 2021 12:29:26 GMT
- Title: Density-Based Dynamic Curriculum Learning for Intent Detection
- Authors: Yantao Gong, Cao Liu, Jiazhen Yuan, Fan Yang, Xunliang Cai, Guanglu
Wan, Jiansong Chen, Ruiyao Niu and Houfeng Wang
- Abstract summary: Our model defines each sample's difficulty level according to the density of its eigenvector.
We apply a dynamic curriculum learning strategy, which pays distinct attention to samples of various difficulty levels.
Experiments on three open datasets verify that the proposed density-based algorithm can effectively distinguish simple samples from complex ones.
- Score: 14.653917644725427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models have achieved noticeable performance on
the intent detection task. However, because they assign an identical weight to
every sample, they tend to overfit simple samples and fail to learn complex
samples well. To handle this problem, we propose a density-based dynamic
curriculum learning model. Our model defines each sample's difficulty level
according to the density of its eigenvector, thereby exploiting the overall
distribution of all samples' eigenvectors simultaneously. We then apply a
dynamic curriculum learning strategy that pays distinct attention to samples of
various difficulty levels and alters the proportion of samples during the
training process. In this way, simple samples are well trained and complex
samples are enhanced. Experiments on three open datasets verify that the
proposed density-based algorithm can effectively distinguish simple samples
from complex ones. Moreover, our model obtains a clear improvement over strong
baselines.
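The abstract only sketches the method, so here is a minimal illustrative sketch of how a density-based difficulty score and a dynamic curriculum schedule could fit together. It assumes the "eigenvectors" are the encoder's per-sample feature vectors, estimates density with a k-nearest-neighbor heuristic, and uses a linear pacing function; all three are assumptions for illustration, not the authors' exact formulation.

```python
# A minimal sketch, assuming k-NN density over feature vectors and a
# linear pacing schedule (illustrative, not the paper's exact method).
import numpy as np

def density_difficulty(features: np.ndarray, k: int = 10) -> np.ndarray:
    """Low local density -> high difficulty, normalized to [0, 1]."""
    # Pairwise Euclidean distances between all feature vectors.
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    # Mean distance to the k nearest neighbors (index 0 is the sample itself).
    knn = np.sort(dists, axis=1)[:, 1:k + 1].mean(axis=1)
    return (knn - knn.min()) / (knn.max() - knn.min() + 1e-12)

def curriculum_subset(difficulty: np.ndarray, epoch: int,
                      total_epochs: int) -> np.ndarray:
    """Start with the easiest 30% and linearly admit harder samples."""
    frac = min(1.0, 0.3 + 0.7 * epoch / max(1, total_epochs - 1))
    order = np.argsort(difficulty)                 # easy -> hard
    return order[: max(1, int(frac * len(order)))]

# Toy usage with random 32-dim "feature vectors" for 200 samples.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 32))
diff = density_difficulty(feats)
for epoch in range(5):
    idx = curriculum_subset(diff, epoch, total_epochs=5)
    print(f"epoch {epoch}: training on {len(idx)} samples")
```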
Related papers
- Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization [165.98557106089777]
A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data.
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets.
We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
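As a loose illustration of what instance-reweighted distributionally robust optimization can look like: with a KL-divergence regularizer, the worst-case instance weights take a softmax-of-losses form. The temperature tau below is a hypothetical hyperparameter; the paper's actual framework may differ.

```python
# Illustrative only: KL-regularized DRO upweights hard (high-loss)
# samples via softmax weights w_i ∝ exp(loss_i / tau).
import numpy as np

def dro_instance_weights(losses: np.ndarray, tau: float = 1.0) -> np.ndarray:
    z = (losses - losses.max()) / tau     # max-shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

losses = np.array([0.2, 0.5, 2.0, 0.1])
weights = dro_instance_weights(losses)
reweighted_loss = float((weights * losses).sum())  # emphasizes hard samples
```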
arXiv Detail & Related papers (2024-02-22T04:10:57Z)
- Data Pruning via Moving-one-Sample-out [61.45441981346064]
We propose a novel data-pruning approach called moving-one-sample-out (MoSo), which aims to identify and remove the least informative samples from the training set.
Experimental results demonstrate that MoSo effectively mitigates severe performance degradation at high pruning ratios.
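A hedged sketch of the leave-one-out intuition behind MoSo: score each sample by how well its gradient agrees with the average gradient of the remaining samples, and prune the lowest scorers. The gradient-dot-product surrogate is an assumption for illustration, not the paper's exact estimator.

```python
# Illustrative surrogate: a sample is informative if its gradient aligns
# with the mean gradient of the remaining samples.
import numpy as np

def moso_scores(per_sample_grads: np.ndarray) -> np.ndarray:
    """per_sample_grads: (n, d) array of per-sample loss gradients."""
    n = per_sample_grads.shape[0]
    total = per_sample_grads.sum(axis=0)
    scores = np.empty(n)
    for i in range(n):
        others_mean = (total - per_sample_grads[i]) / (n - 1)
        scores[i] = per_sample_grads[i] @ others_mean
    return scores

# Keep the top 80% of samples by score; prune the least informative 20%.
grads = np.random.default_rng(1).normal(size=(100, 16))
keep = np.argsort(moso_scores(grads))[::-1][:80]
```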
arXiv Detail & Related papers (2023-10-23T08:00:03Z)
- Hard Sample Aware Network for Contrastive Deep Graph Clustering [38.44763843990694]
We propose a novel contrastive deep graph clustering method dubbed Hard Sample Aware Network (HSAN).
In our algorithm, the similarities between samples are calculated by considering both the attribute embeddings and the structure embeddings.
Under the guidance of the carefully collected high-confidence clustering information, our proposed weight modulating function will first recognize the positive and negative samples.
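A minimal sketch of the similarity computation the summary describes, fusing attribute and structure embeddings; cosine similarity and the fixed mixing weight alpha are illustrative assumptions.

```python
# Sketch of a fused pairwise similarity over two embedding views.
import numpy as np

def fused_similarity(attr_emb: np.ndarray, struct_emb: np.ndarray,
                     alpha: float = 0.5) -> np.ndarray:
    def cosine(x: np.ndarray) -> np.ndarray:
        x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-12)
        return x @ x.T
    return alpha * cosine(attr_emb) + (1.0 - alpha) * cosine(struct_emb)
```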
arXiv Detail & Related papers (2022-12-16T16:57:37Z)
- DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination [28.599571524763785]
Given data with label noise (i.e., incorrectly labeled data), deep neural networks gradually memorize the noisy labels, which impairs model performance.
To alleviate this issue, curriculum learning has been proposed to improve model performance and generalization by ordering training samples in a meaningful sequence.
arXiv Detail & Related papers (2022-08-21T13:38:55Z)
- Reweighted Manifold Learning of Collective Variables from Enhanced Sampling Simulations [2.6009298669020477]
We provide a framework based on anisotropic diffusion maps for manifold learning.
We show that our framework reverts the biasing effect, yielding collective variables (CVs) that correctly describe the equilibrium density.
We show that it can be used in many manifold learning techniques on data from both standard and enhanced sampling simulations.
arXiv Detail & Related papers (2022-07-29T08:59:56Z)
- Style Curriculum Learning for Robust Medical Image Segmentation [62.02435329931057]
Deep segmentation models often degrade due to distribution shifts in image intensities between the training and test data sets.
We propose a novel framework to ensure robust segmentation in the presence of such distribution shifts.
arXiv Detail & Related papers (2021-08-01T08:56:24Z)
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
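A rough sketch of the two-view consistency idea in this summary: agreement between a sample's predictions under two augmented views serves as a proxy for its likelihood of being clean. The Jensen-Shannon divergence here is an illustrative choice, not necessarily the paper's exact measure.

```python
# Sketch: two-view prediction agreement as a clean-label proxy.
import numpy as np

def clean_likelihood(p1: np.ndarray, p2: np.ndarray) -> np.ndarray:
    """p1, p2: (n, c) softmax outputs from two augmented views."""
    m = 0.5 * (p1 + p2)
    def kl(a, b):
        return np.sum(a * (np.log(a + 1e-12) - np.log(b + 1e-12)), axis=1)
    js = 0.5 * kl(p1, m) + 0.5 * kl(p2, m)   # in [0, log 2]
    return 1.0 - js / np.log(2)              # high agreement -> likely clean
```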
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
- One for More: Selecting Generalizable Samples for Generalizable ReID Model [92.40951770273972]
This paper proposes a one-for-more training objective that takes the generalization ability of selected samples as a loss function.
Our proposed one-for-more based sampler can be seamlessly integrated into the ReID training framework.
arXiv Detail & Related papers (2020-12-10T06:37:09Z)
- Optimal Importance Sampling for Federated Learning [57.14673504239551]
Federated learning involves a mixture of centralized and decentralized processing tasks.
The sampling of both agents and data is generally uniform; however, in this work we consider non-uniform sampling.
We derive optimal importance sampling strategies for both agent and data selection and show that non-uniform sampling without replacement improves the performance of the original FedAvg algorithm.
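To make the non-uniform sampling idea concrete, here is a hedged sketch of importance-sampled client selection with inverse-probability weighting to keep the server aggregate unbiased. Sampling with replacement, probabilities proportional to local gradient norms, and the hypothetical local_update call are simplifying assumptions; the paper derives optimal strategies and studies sampling without replacement.

```python
# Sketch: importance-sampled clients with inverse-probability weights.
import numpy as np

rng = np.random.default_rng(2)
n_clients, m = 20, 5
grad_norms = rng.uniform(0.1, 2.0, size=n_clients)   # stand-in importance
p = grad_norms / grad_norms.sum()                    # sampling distribution

chosen = rng.choice(n_clients, size=m, replace=True, p=p)
weights = 1.0 / (m * p[chosen])   # E[sum_j w_j * u_{c_j}] = sum_i u_i
# server_delta = sum(w * local_update(c) for w, c in zip(weights, chosen))
```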
arXiv Detail & Related papers (2020-10-26T14:15:33Z)
- Minority Class Oversampling for Tabular Data with Deep Generative Models [4.976007156860967]
We study the ability of deep generative models to provide realistic samples that improve performance on imbalanced classification tasks via oversampling.
Our experiments show that the sampling method does not affect sample quality, but runtimes vary widely.
We also observe that the improvements in the performance metrics, while statistically significant, are often minor in absolute terms.
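A hedged sketch of generative oversampling: fit a generative model to the minority-class features and draw synthetic samples until the classes balance. A single multivariate Gaussian below is a stand-in for the deep generative models the paper actually studies.

```python
# Stand-in generator: one Gaussian fitted to the minority class.
import numpy as np

def gaussian_oversample(minority: np.ndarray, n_new: int,
                        seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    mu = minority.mean(axis=0)
    cov = np.cov(minority, rowvar=False) + 1e-6 * np.eye(minority.shape[1])
    return rng.multivariate_normal(mu, cov, size=n_new)

# Balance a 90/10 split by generating 80 synthetic minority samples.
minority = np.random.default_rng(3).normal(size=(10, 4))
synthetic = gaussian_oversample(minority, n_new=80)
```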
arXiv Detail & Related papers (2020-05-07T21:35:57Z)
- Efficient Deep Representation Learning by Adaptive Latent Space Sampling [16.320898678521843]
Supervised deep learning requires a large amount of training samples with annotations, which are expensive and time-consuming to obtain.
We propose a novel training framework which adaptively selects informative samples that are fed to the training process.
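A rough sketch of adaptive informative-sample selection, using prediction entropy as a stand-in informativeness score; the paper selects in a learned latent space, so the entropy criterion here is purely an assumption.

```python
# Illustrative criterion: prediction entropy as informativeness.
import numpy as np

def entropy_select(probs: np.ndarray, n_select: int) -> np.ndarray:
    """probs: (n, c) softmax predictions; returns most uncertain indices."""
    ent = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(ent)[::-1][:n_select]
```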
arXiv Detail & Related papers (2020-03-19T22:17:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.