Towards Robust Few-shot Class Incremental Learning in Audio Classification using Contrastive Representation
- URL: http://arxiv.org/abs/2407.19265v2
- Date: Wed, 7 Aug 2024 09:16:12 GMT
- Title: Towards Robust Few-shot Class Incremental Learning in Audio Classification using Contrastive Representation
- Authors: Riyansha Singh, Parinita Nema, Vinod K Kurmi,
- Abstract summary: Few-shot class-incremental learning addresses challenges arising from limited incoming data.
We propose supervised contrastive learning to refine the representation space, enhancing discriminative power and leading to better generalization.
- Score: 1.3586572110652484
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In machine learning applications, gradual data ingress is common, especially in audio processing where incremental learning is vital for real-time analytics. Few-shot class-incremental learning addresses challenges arising from limited incoming data. Existing methods often integrate additional trainable components or rely on a fixed embedding extractor post-training on base sessions to mitigate concerns related to catastrophic forgetting and the dangers of model overfitting. However, using cross-entropy loss alone during base session training is suboptimal for audio data. To address this, we propose incorporating supervised contrastive learning to refine the representation space, enhancing discriminative power and leading to better generalization since it facilitates seamless integration of incremental classes, upon arrival. Experimental results on NSynth and LibriSpeech datasets with 100 classes, as well as ESC dataset with 50 and 10 classes, demonstrate state-of-the-art performance.
Related papers
- The Right Time Matters: Data Arrangement Affects Zero-Shot Generalization in Instruction Tuning [86.19804569376333]
We show that zero-shot generalization happens very early during instruction tuning.
We propose a more grounded training data arrangement framework, Test-centric Multi-turn Arrangement.
arXiv Detail & Related papers (2024-06-17T16:40:21Z) - OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning [57.43911113915546]
Few-Shot Class-Incremental Learning (FSCIL) introduces a paradigm in which the problem space expands with limited data.
FSCIL methods inherently face the challenge of catastrophic forgetting as data arrives incrementally.
We propose the OrCo framework built on two core principles: features' orthogonality in the representation space, and contrastive learning.
arXiv Detail & Related papers (2024-03-27T13:30:48Z) - Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z) - Understanding and Mitigating the Label Noise in Pre-training on
Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z) - Representation Learning for the Automatic Indexing of Sound Effects
Libraries [79.68916470119743]
We show that a task-specific but dataset-independent representation can successfully address data issues such as class imbalance, inconsistent class labels, and insufficient dataset size.
Detailed experimental results show the impact of metric learning approaches and different cross-dataset training methods on representational effectiveness.
arXiv Detail & Related papers (2022-08-18T23:46:13Z) - Efficient Augmentation for Imbalanced Deep Learning [8.38844520504124]
We study a convolutional neural network's internal representation of imbalanced image data.
We measure the generalization gap between a model's feature embeddings in the training and test sets, showing that the gap is wider for minority classes.
This insight enables us to design an efficient three-phase CNN training framework for imbalanced data.
arXiv Detail & Related papers (2022-07-13T09:43:17Z) - Improved Speech Emotion Recognition using Transfer Learning and
Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z) - Few-Shot Incremental Learning with Continually Evolved Classifiers [46.278573301326276]
Few-shot class-incremental learning (FSCIL) aims to design machine learning algorithms that can continually learn new concepts from a few data points.
The difficulty lies in that limited data from new classes not only lead to significant overfitting issues but also exacerbate the notorious catastrophic forgetting problems.
We propose a Continually Evolved CIF ( CEC) that employs a graph model to propagate context information between classifiers for adaptation.
arXiv Detail & Related papers (2021-04-07T10:54:51Z) - Searching for Robustness: Loss Learning for Noisy Classification Tasks [81.70914107917551]
We parameterize a flexible family of loss functions using Taylors and apply evolutionary strategies to search for noise-robust losses in this space.
The resulting white-box loss provides a simple and fast "plug-and-play" module that enables effective noise-robust learning in diverse downstream tasks.
arXiv Detail & Related papers (2021-02-27T15:27:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.