Incremental Learning Algorithm for Sound Event Detection
- URL: http://arxiv.org/abs/2003.12175v1
- Date: Thu, 26 Mar 2020 22:32:11 GMT
- Title: Incremental Learning Algorithm for Sound Event Detection
- Authors: Eunjeong Koh, Fatemeh Saki, Yinyi Guo, Cheng-Yu Hung, Erik Visser
- Abstract summary: This paper presents a new learning strategy for a Sound Event Detection (SED) system to tackle two issues: i) migrating knowledge from a pre-trained model to a new target model and ii) learning new sound events without forgetting the previously learned ones and without re-training from scratch.
To migrate the previously learned knowledge from the source model to the target one, a neural adapter is employed on top of the source model.
The neural adapter layer enables the target model to learn new sound events with minimal training data while maintaining performance on the previously learned sound events comparable to that of the source model.
- Score: 0.8399688944263841
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a new learning strategy for the Sound Event Detection
(SED) system to tackle two issues: i) knowledge migration from a pre-trained
model to a new target model and ii) learning new sound events without
forgetting the previously learned ones and without re-training from scratch.
To migrate the previously learned knowledge from the source model to the
target one, a neural adapter is employed on top of the source model. The
source model and the target model are merged via this neural adapter layer.
The neural adapter layer enables the target model to learn new sound events
with minimal training data while maintaining performance on the previously
learned sound events comparable to that of the source model. Our extensive
analysis on the DCASE16 and US-SED datasets reveals the effectiveness of the
proposed method in transferring knowledge between the source and target models
without introducing any performance degradation on the previously learned
sound events, while obtaining competitive detection performance on the newly
learned sound events.
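A minimal sketch of the adapter idea in PyTorch, assuming simple feed-forward stand-ins for the source and target models; the layer sizes, the additive merge, and the shared classification head are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class NeuralAdapter(nn.Module):
    """Bridges a frozen source model into a target model (illustrative)."""
    def __init__(self, source_dim: int, target_dim: int):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(source_dim, target_dim), nn.ReLU())

    def forward(self, source_feats, target_feats):
        # Merge the source model's knowledge into the target stream.
        return target_feats + self.proj(source_feats)

source = nn.Sequential(nn.Linear(64, 128), nn.ReLU())   # pre-trained, frozen
target = nn.Sequential(nn.Linear(64, 128), nn.ReLU())   # learns new events
adapter = NeuralAdapter(source_dim=128, target_dim=128)
head = nn.Linear(128, 5)  # e.g. 5 sound-event classes, old + new

for p in source.parameters():          # keep source knowledge intact
    p.requires_grad = False

x = torch.randn(8, 64)                 # batch of audio feature vectors
with torch.no_grad():
    s = source(x)
logits = head(adapter(s, target(x)))   # merged prediction
```

Only the target model, the adapter, and the head receive gradients here, which is what keeps the frozen source model's behavior on previously learned events intact.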
Related papers
- Transfer Learning for Passive Sonar Classification using Pre-trained Audio and ImageNet Models [39.85805843651649]
This study compares pre-trained Audio Neural Networks (PANNs) and ImageNet pre-trained models.
The ImageNet pre-trained models were observed to slightly outperform the pre-trained audio models in passive sonar classification.
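As a rough illustration of the ImageNet transfer recipe, the sketch below (assuming PyTorch/torchvision; the class count and input handling are hypothetical) swaps the classifier head of a pre-trained ResNet and feeds it sonar spectrograms tiled to three channels.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

# Load an ImageNet pre-trained backbone and replace its classifier head
# with one sized for a hypothetical set of sonar target classes.
num_sonar_classes = 4                        # assumption for illustration
model = resnet18(weights=ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, num_sonar_classes)

# A sonar spectrogram batch, tiled to 3 channels to match ImageNet input.
spectrograms = torch.randn(8, 1, 224, 224)
logits = model(spectrograms.repeat(1, 3, 1, 1))
```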
arXiv Detail & Related papers (2024-09-20T20:13:45Z)
- BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion [56.9358325168226]
We propose a Bagging deep learning training algorithm based on Efficient Neural network Diffusion (BEND)
Our approach is simple but effective: it first uses the weights and biases of multiple trained models as inputs to train an autoencoder and a latent diffusion model.
Our proposed BEND algorithm can consistently outperform the mean and median accuracies of both the original trained model and the diffused model.
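A heavily simplified sketch of the idea, assuming PyTorch: an autoencoder is trained on flattened weight vectors of several models, and new candidates are produced by perturbing latents (a stand-in for the latent diffusion model, which is omitted here) before bagging. All sizes and the perturbation scale are assumptions.

```python
import torch
import torch.nn as nn

# Flattened parameter vectors of several independently trained models
# (random stand-ins here; in practice these come from real checkpoints).
weight_bank = torch.randn(16, 512)          # 16 models, 512 params each

# Autoencoder over model weights; a latent diffusion model would be
# trained in this latent space, approximated below by Gaussian noise.
encoder = nn.Sequential(nn.Linear(512, 64), nn.ReLU(), nn.Linear(64, 16))
decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 512))
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

for _ in range(200):                         # reconstruct the weight bank
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(weight_bank)), weight_bank)
    loss.backward()
    opt.step()

# "Diffuse" new weight vectors by perturbing latents, then bag them:
with torch.no_grad():
    z = encoder(weight_bank) + 0.1 * torch.randn(16, 16)
    new_weights = decoder(z)                 # candidate model parameters
    bagged = new_weights.mean(dim=0)         # ensemble by averaging
```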
arXiv Detail & Related papers (2024-03-23T08:40:38Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise and improve generalization.
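A minimal sketch of affine feature-space tuning, assuming PyTorch; the frozen backbone, layer sizes, and training setup are stand-ins, and the paper's full NMTune objective is not reproduced here.

```python
import torch
import torch.nn as nn

class AffineTune(nn.Module):
    """Learnable per-dimension scale and shift over frozen features."""
    def __init__(self, dim: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(dim))
        self.shift = nn.Parameter(torch.zeros(dim))

    def forward(self, feats):
        return feats * self.scale + self.shift

backbone = nn.Linear(32, 128)     # stands in for a noisily pre-trained model
for p in backbone.parameters():
    p.requires_grad = False       # backbone stays frozen

tune, head = AffineTune(128), nn.Linear(128, 10)
x = torch.randn(4, 32)
with torch.no_grad():
    feats = backbone(x)
logits = head(tune(feats))        # only tune + head are trained
```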
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- BSDP: Brain-inspired Streaming Dual-level Perturbations for Online Open World Object Detection [31.467501311528498]
We aim to make deep learning models simulate the way people learn.
Existing OWOD approaches focus on identifying unknown categories, although the incremental learning component is equally important.
In this paper, we use the dual-level information of old samples as perturbations on new samples, enabling the model to learn new knowledge without forgetting the old.
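A rough sketch of the dual-level idea, assuming PyTorch; mixing stored old-sample information into new batches at both the feature and label levels with a single coefficient is a simplification of the paper's perturbation scheme.

```python
import torch

def dual_level_perturb(new_feats, new_labels, old_feats, old_labels,
                       alpha: float = 0.2):
    """Mix stored old-sample information into a new batch (illustrative)."""
    idx = torch.randint(0, old_feats.size(0), (new_feats.size(0),))
    # Feature-level perturbation: blend in old features.
    feats = (1 - alpha) * new_feats + alpha * old_feats[idx]
    # Label-level perturbation: soften targets toward old-class labels.
    labels = (1 - alpha) * new_labels + alpha * old_labels[idx]
    return feats, labels

new_feats = torch.randn(8, 128)
new_labels = torch.eye(10)[torch.randint(0, 10, (8,))]    # one-hot targets
old_feats = torch.randn(32, 128)                          # stored old samples
old_labels = torch.eye(10)[torch.randint(0, 10, (32,))]
feats, soft_labels = dual_level_perturb(new_feats, new_labels,
                                        old_feats, old_labels)
```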
arXiv Detail & Related papers (2024-03-05T04:00:50Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
- DiffSED: Sound Event Detection with Denoising Diffusion [70.18051526555512]
We reformulate the SED problem by taking a generative learning perspective.
Specifically, we aim to generate sound temporal boundaries from noisy proposals in a denoising diffusion process.
During training, our model learns to reverse the noising process by converting noisy latent queries to the groundtruth versions.
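A toy sketch of the denoising training step, assuming PyTorch: event boundaries are corrupted with a simple linear noising schedule, and a small network learns to predict the clean boundaries. The schedule, network, and loss are placeholders, not DiffSED's actual design.

```python
import torch
import torch.nn as nn

# Ground-truth (onset, offset) pairs, normalized to [0, 1] per clip.
gt_boundaries = torch.rand(8, 2).sort(dim=1).values

denoiser = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)

for _ in range(100):
    t = torch.rand(8, 1)                         # diffusion timestep in [0, 1]
    noise = torch.randn_like(gt_boundaries)
    noisy = (1 - t) * gt_boundaries + t * noise  # simple linear noising schedule
    pred = denoiser(torch.cat([noisy, t], dim=1))
    loss = nn.functional.mse_loss(pred, gt_boundaries)  # recover clean boundaries
    opt.zero_grad()
    loss.backward()
    opt.step()
```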
arXiv Detail & Related papers (2023-08-14T17:29:41Z)
- Low-Resource Music Genre Classification with Cross-Modal Neural Model Reprogramming [129.4950757742912]
We introduce a novel method for leveraging pre-trained models for low-resource (music) classification based on the concept of Neural Model Reprogramming (NMR).
NMR aims at re-purposing a pre-trained model from a source domain to a target domain by modifying the input of a frozen pre-trained model.
Experimental results suggest that a neural model pre-trained on large-scale datasets can successfully perform music genre classification by using this reprogramming method.
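A minimal sketch of input reprogramming, assuming PyTorch: a trainable perturbation is added to the input of a frozen model, and a small output mapping re-purposes its predictions as genre labels. The dimensions and the linear label mapping are assumptions.

```python
import torch
import torch.nn as nn

frozen = nn.Linear(128, 1000)            # stands in for a frozen pre-trained model
for p in frozen.parameters():
    p.requires_grad = False

delta = nn.Parameter(torch.zeros(128))   # trainable input perturbation
label_map = nn.Linear(1000, 8)           # map source outputs to 8 genres
opt = torch.optim.Adam([delta, *label_map.parameters()], lr=1e-3)

x = torch.randn(4, 128)                  # features for 4 music clips
y = torch.randint(0, 8, (4,))
logits = label_map(frozen(x + delta))    # only delta and the map are learned
loss = nn.functional.cross_entropy(logits, y)
opt.zero_grad()
loss.backward()
opt.step()
```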
arXiv Detail & Related papers (2022-11-02T17:38:33Z)
- Matching Text and Audio Embeddings: Exploring Transfer-learning Strategies for Language-based Audio Retrieval [11.161404854726348]
We present an analysis of large-scale pretrained deep learning models used for cross-modal (text-to-audio) retrieval.
We use embeddings extracted by these models in a metric learning framework to connect matching pairs of audio and text.
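A minimal sketch of the metric-learning step, assuming PyTorch and pre-extracted embeddings; the projection heads, temperature, and symmetric InfoNCE loss are common choices, not necessarily the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

audio_proj = nn.Linear(512, 256)     # project pre-extracted embeddings
text_proj = nn.Linear(768, 256)      # into a shared retrieval space

audio_emb = torch.randn(16, 512)     # e.g. from a pre-trained audio model
text_emb = torch.randn(16, 768)      # e.g. from a pre-trained text model

a = F.normalize(audio_proj(audio_emb), dim=1)
t = F.normalize(text_proj(text_emb), dim=1)
logits = a @ t.T / 0.07              # cosine similarities, scaled
targets = torch.arange(16)           # matching pairs lie on the diagonal
loss = (F.cross_entropy(logits, targets)
        + F.cross_entropy(logits.T, targets)) / 2
```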
arXiv Detail & Related papers (2022-10-06T11:45:14Z)
- Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models [89.44031286278347]
We propose a Hub-Pathway framework to enable knowledge transfer from a model hub.
The proposed framework can be trained end-to-end with the target task-specific loss.
Experiment results on computer vision and reinforcement learning tasks demonstrate that the framework achieves the state-of-the-art performance.
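A simplified sketch of hub-style transfer, assuming PyTorch: a gating network produces data-dependent weights over several frozen hub models, and the trainable parts are optimized end-to-end with the target loss. The soft weighting is a simplification of the paper's pathway routing.

```python
import torch
import torch.nn as nn

hub = [nn.Linear(64, 32) for _ in range(3)]  # three frozen pre-trained models
for m in hub:
    for p in m.parameters():
        p.requires_grad = False

gate = nn.Sequential(nn.Linear(64, 3), nn.Softmax(dim=1))  # data-dependent routing
head = nn.Linear(32, 10)

x = torch.randn(8, 64)
weights = gate(x)                                       # (8, 3) pathway weights
feats = torch.stack([m(x) for m in hub], dim=1)         # (8, 3, 32)
out = head((weights.unsqueeze(-1) * feats).sum(dim=1))  # weighted combination
loss = nn.functional.cross_entropy(out, torch.randint(0, 10, (8,)))
loss.backward()                                         # trains gate + head end-to-end
```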
arXiv Detail & Related papers (2022-06-08T08:00:12Z)
- Two-Level Residual Distillation based Triple Network for Incremental Object Detection [21.725878050355824]
We propose a novel incremental object detector based on Faster R-CNN to continuously learn from new object classes without using old data.
It is a triple network in which an old model and a residual model act as assistants, helping the incremental model learn new classes without forgetting previously learned knowledge.
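A minimal sketch of the distillation idea, assuming PyTorch and plain classifier heads in place of Faster R-CNN; the residual-model branch is omitted, and the loss weighting is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

old_model = nn.Linear(128, 10)      # frozen detector head, 10 old classes
for p in old_model.parameters():
    p.requires_grad = False

new_model = nn.Linear(128, 15)      # incremental head: 10 old + 5 new classes

x = torch.randn(8, 128)             # region features from new-class images
y_new = torch.randint(10, 15, (8,)) # labels for the new classes only

with torch.no_grad():
    old_logits = old_model(x)
new_logits = new_model(x)

# Distill old-class responses, supervise new classes with cross-entropy.
distill = F.mse_loss(new_logits[:, :10], old_logits)
ce = F.cross_entropy(new_logits, y_new)
loss = ce + distill
loss.backward()
```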
arXiv Detail & Related papers (2020-07-27T11:04:57Z)
- Active Learning for Sound Event Detection [18.750572243562576]
This paper proposes an active learning system for sound event detection (SED).
It aims to maximize the accuracy of a learned SED model with limited annotation effort.
Remarkably, the required annotation effort can be greatly reduced on the dataset where target sound events are rare.
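A minimal sketch of one query round, assuming PyTorch and entropy-based uncertainty sampling, which is a common selection criterion and not necessarily the paper's rule.

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 5)        # stand-in SED classifier, 5 event types
pool = torch.randn(100, 128)     # features of unlabeled audio clips

with torch.no_grad():
    probs = model(pool).softmax(dim=1)
    entropy = -(probs * probs.clamp_min(1e-9).log()).sum(dim=1)

# Query the K clips the model is least certain about for annotation.
K = 10
query_idx = entropy.topk(K).indices
print("clips to annotate:", query_idx.tolist())
```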
arXiv Detail & Related papers (2020-02-12T14:46:55Z)