Incremental Learning Algorithm for Sound Event Detection
- URL: http://arxiv.org/abs/2003.12175v1
- Date: Thu, 26 Mar 2020 22:32:11 GMT
- Title: Incremental Learning Algorithm for Sound Event Detection
- Authors: Eunjeong Koh, Fatemeh Saki, Yinyi Guo, Cheng-Yu Hung, Erik Visser
- Abstract summary: This paper presents a new learning strategy for a Sound Event Detection (SED) system to tackle two issues: i) migrating knowledge from a pre-trained model to a new target model and ii) learning new sound events without forgetting the previously learned ones and without re-training from scratch.
To migrate the previously learned knowledge from the source model to the target one, a neural adapter is employed on top of the source model.
The neural adapter layer enables the target model to learn new sound events with minimal training data while maintaining performance on the previously learned sound events comparable to that of the source model.
- Score: 0.8399688944263841
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a new learning strategy for the Sound Event Detection
(SED) system to tackle two issues: i) knowledge migration from a pre-trained
model to a new target model and ii) learning new sound events without
forgetting the previously learned ones and without re-training from scratch.
To migrate the previously learned knowledge from the source model to the
target one, a neural adapter is employed on top of the source model. The
source model and the target model are merged via this neural adapter layer.
The neural adapter layer enables the target model to learn new sound events
with minimal training data while maintaining performance on the previously
learned sound events comparable to that of the source model. Our extensive
analysis on the DCASE16 and US-SED datasets reveals the effectiveness of the
proposed method in transferring knowledge between the source and target models
without introducing any performance degradation on the previously learned
sound events, while obtaining competitive detection performance on the newly
learned sound events.
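A minimal sketch of the adapter idea in PyTorch, assuming simple feed-forward stand-ins for the source and target models; the layer sizes, the additive merge, and the shared classification head are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class NeuralAdapter(nn.Module):
    """Bridges a frozen source model into a target model (illustrative)."""
    def __init__(self, source_dim: int, target_dim: int):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(source_dim, target_dim), nn.ReLU())

    def forward(self, source_feats, target_feats):
        # Merge the source model's knowledge into the target stream.
        return target_feats + self.proj(source_feats)

source = nn.Sequential(nn.Linear(64, 128), nn.ReLU())   # pre-trained, frozen
target = nn.Sequential(nn.Linear(64, 128), nn.ReLU())   # learns new events
adapter = NeuralAdapter(source_dim=128, target_dim=128)
head = nn.Linear(128, 5)  # e.g. 5 sound-event classes, old + new

for p in source.parameters():          # keep source knowledge intact
    p.requires_grad = False

x = torch.randn(8, 64)                 # batch of audio feature vectors
with torch.no_grad():
    s = source(x)
logits = head(adapter(s, target(x)))   # merged prediction
```

Only the target model, the adapter, and the head receive gradients here, which is what keeps the frozen source model's behavior on previously learned events intact.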
Related papers
- Transfer Learning for Passive Sonar Classification using Pre-trained Audio and ImageNet Models [39.85805843651649]
This study compares pre-trained Audio Neural Networks (PANNs) and ImageNet pre-trained models.
The ImageNet pre-trained models were observed to slightly outperform the pre-trained audio models in passive sonar classification.
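As a rough illustration of the ImageNet transfer recipe, the sketch below (assuming PyTorch/torchvision; the class count and input handling are hypothetical) swaps the classifier head of a pre-trained ResNet and feeds it sonar spectrograms tiled to three channels.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

# Load an ImageNet pre-trained backbone and replace its classifier head
# with one sized for a hypothetical set of sonar target classes.
num_sonar_classes = 4                        # assumption for illustration
model = resnet18(weights=ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, num_sonar_classes)

# A sonar spectrogram batch, tiled to 3 channels to match ImageNet input.
spectrograms = torch.randn(8, 1, 224, 224)
logits = model(spectrograms.repeat(1, 3, 1, 1))
```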
arXiv Detail & Related papers (2024-09-20T20:13:45Z)
- BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion [56.9358325168226]
We propose a Bagging deep learning training algorithm based on Efficient Neural network Diffusion (BEND)
Our approach is simple but effective: it first uses the weights and biases of multiple trained models as inputs to train an autoencoder and a latent diffusion model.
Our proposed BEND algorithm can consistently outperform the mean and median accuracies of both the original trained model and the diffused model.
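A heavily simplified sketch of the idea, assuming PyTorch: an autoencoder is trained on flattened weight vectors of several models, and new candidates are produced by perturbing latents (a stand-in for the latent diffusion model, which is omitted here) before bagging. All sizes and the perturbation scale are assumptions.

```python
import torch
import torch.nn as nn

# Flattened parameter vectors of several independently trained models
# (random stand-ins here; in practice these come from real checkpoints).
weight_bank = torch.randn(16, 512)          # 16 models, 512 params each

# Autoencoder over model weights; a latent diffusion model would be
# trained in this latent space, approximated below by Gaussian noise.
encoder = nn.Sequential(nn.Linear(512, 64), nn.ReLU(), nn.Linear(64, 16))
decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 512))
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

for _ in range(200):                         # reconstruct the weight bank
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(weight_bank)), weight_bank)
    loss.backward()
    opt.step()

# "Diffuse" new weight vectors by perturbing latents, then bag them:
with torch.no_grad():
    z = encoder(weight_bank) + 0.1 * torch.randn(16, 16)
    new_weights = decoder(z)                 # candidate model parameters
    bagged = new_weights.mean(dim=0)         # ensemble by averaging
```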
arXiv Detail & Related papers (2024-03-23T08:40:38Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise and improve generalization.
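A minimal sketch of affine feature-space tuning, assuming PyTorch; the frozen backbone, layer sizes, and training setup are stand-ins, and the paper's full NMTune objective is not reproduced here.

```python
import torch
import torch.nn as nn

class AffineTune(nn.Module):
    """Learnable per-dimension scale and shift over frozen features."""
    def __init__(self, dim: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(dim))
        self.shift = nn.Parameter(torch.zeros(dim))

    def forward(self, feats):
        return feats * self.scale + self.shift

backbone = nn.Linear(32, 128)     # stands in for a noisily pre-trained model
for p in backbone.parameters():
    p.requires_grad = False       # backbone stays frozen

tune, head = AffineTune(128), nn.Linear(128, 10)
x = torch.randn(4, 32)
with torch.no_grad():
    feats = backbone(x)
logits = head(tune(feats))        # only tune + head are trained
```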
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- BSDP: Brain-inspired Streaming Dual-level Perturbations for Online Open World Object Detection [31.467501311528498]
We aim to make deep learning models simulate the way people learn.
Existing OWOD approaches focus on identifying unknown categories, although the incremental learning component is equally important.
In this paper, we use the dual-level information of old samples as perturbations on new samples, enabling the model to learn new knowledge without forgetting the old.
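A rough sketch of the dual-level idea, assuming PyTorch; mixing stored old-sample information into new batches at both the feature and label levels with a single coefficient is a simplification of the paper's perturbation scheme.

```python
import torch

def dual_level_perturb(new_feats, new_labels, old_feats, old_labels,
                       alpha: float = 0.2):
    """Mix stored old-sample information into a new batch (illustrative)."""
    idx = torch.randint(0, old_feats.size(0), (new_feats.size(0),))
    # Feature-level perturbation: blend in old features.
    feats = (1 - alpha) * new_feats + alpha * old_feats[idx]
    # Label-level perturbation: soften targets toward old-class labels.
    labels = (1 - alpha) * new_labels + alpha * old_labels[idx]
    return feats, labels

new_feats = torch.randn(8, 128)
new_labels = torch.eye(10)[torch.randint(0, 10, (8,))]    # one-hot targets
old_feats = torch.randn(32, 128)                          # stored old samples
old_labels = torch.eye(10)[torch.randint(0, 10, (32,))]
feats, soft_labels = dual_level_perturb(new_feats, new_labels,
                                        old_feats, old_labels)
```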
arXiv Detail & Related papers (2024-03-05T04:00:50Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
- DiffSED: Sound Event Detection with Denoising Diffusion [70.18051526555512]
We reformulate the SED problem by taking a generative learning perspective.
Specifically, we aim to generate sound temporal boundaries from noisy proposals in a denoising diffusion process.
During training, our model learns to reverse the noising process by converting noisy latent queries to the groundtruth versions.
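A toy sketch of the denoising training step, assuming PyTorch: event boundaries are corrupted with a simple linear noising schedule, and a small network learns to predict the clean boundaries. The schedule, network, and loss are placeholders, not DiffSED's actual design.

```python
import torch
import torch.nn as nn

# Ground-truth (onset, offset) pairs, normalized to [0, 1] per clip.
gt_boundaries = torch.rand(8, 2).sort(dim=1).values

denoiser = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)

for _ in range(100):
    t = torch.rand(8, 1)                         # diffusion timestep in [0, 1]
    noise = torch.randn_like(gt_boundaries)
    noisy = (1 - t) * gt_boundaries + t * noise  # simple linear noising schedule
    pred = denoiser(torch.cat([noisy, t], dim=1))
    loss = nn.functional.mse_loss(pred, gt_boundaries)  # recover clean boundaries
    opt.zero_grad()
    loss.backward()
    opt.step()
```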
arXiv Detail & Related papers (2023-08-14T17:29:41Z)
- Low-Resource Music Genre Classification with Cross-Modal Neural Model Reprogramming [129.4950757742912]
We introduce a novel method for leveraging pre-trained models for low-resource (music) classification based on the concept of Neural Model Reprogramming (NMR).
NMR aims at re-purposing a pre-trained model from a source domain to a target domain by modifying the input of a frozen pre-trained model.
Experimental results suggest that a neural model pre-trained on large-scale datasets can successfully perform music genre classification by using this reprogramming method.
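A minimal sketch of input reprogramming, assuming PyTorch: a trainable perturbation is added to the input of a frozen model, and a small output mapping re-purposes its predictions as genre labels. The dimensions and the linear label mapping are assumptions.

```python
import torch
import torch.nn as nn

frozen = nn.Linear(128, 1000)            # stands in for a frozen pre-trained model
for p in frozen.parameters():
    p.requires_grad = False

delta = nn.Parameter(torch.zeros(128))   # trainable input perturbation
label_map = nn.Linear(1000, 8)           # map source outputs to 8 genres
opt = torch.optim.Adam([delta, *label_map.parameters()], lr=1e-3)

x = torch.randn(4, 128)                  # features for 4 music clips
y = torch.randint(0, 8, (4,))
logits = label_map(frozen(x + delta))    # only delta and the map are learned
loss = nn.functional.cross_entropy(logits, y)
opt.zero_grad()
loss.backward()
opt.step()
```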
arXiv Detail & Related papers (2022-11-02T17:38:33Z)
- Matching Text and Audio Embeddings: Exploring Transfer-learning Strategies for Language-based Audio Retrieval [11.161404854726348]
We present an analysis of large-scale pretrained deep learning models used for cross-modal (text-to-audio) retrieval.
We use embeddings extracted by these models in a metric learning framework to connect matching pairs of audio and text.
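A minimal sketch of the metric-learning step, assuming PyTorch and pre-extracted embeddings; the projection heads, temperature, and symmetric InfoNCE loss are common choices, not necessarily the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

audio_proj = nn.Linear(512, 256)     # project pre-extracted embeddings
text_proj = nn.Linear(768, 256)      # into a shared retrieval space

audio_emb = torch.randn(16, 512)     # e.g. from a pre-trained audio model
text_emb = torch.randn(16, 768)      # e.g. from a pre-trained text model

a = F.normalize(audio_proj(audio_emb), dim=1)
t = F.normalize(text_proj(text_emb), dim=1)
logits = a @ t.T / 0.07              # cosine similarities, scaled
targets = torch.arange(16)           # matching pairs lie on the diagonal
loss = (F.cross_entropy(logits, targets)
        + F.cross_entropy(logits.T, targets)) / 2
```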
arXiv Detail & Related papers (2022-10-06T11:45:14Z)
- Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models [89.44031286278347]
We propose a Hub-Pathway framework to enable knowledge transfer from a model hub.
The proposed framework can be trained end-to-end with the target task-specific loss.
Experiment results on computer vision and reinforcement learning tasks demonstrate that the framework achieves the state-of-the-art performance.
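A simplified sketch of hub-style transfer, assuming PyTorch: a gating network produces data-dependent weights over several frozen hub models, and the trainable parts are optimized end-to-end with the target loss. The soft weighting is a simplification of the paper's pathway routing.

```python
import torch
import torch.nn as nn

hub = [nn.Linear(64, 32) for _ in range(3)]  # three frozen pre-trained models
for m in hub:
    for p in m.parameters():
        p.requires_grad = False

gate = nn.Sequential(nn.Linear(64, 3), nn.Softmax(dim=1))  # data-dependent routing
head = nn.Linear(32, 10)

x = torch.randn(8, 64)
weights = gate(x)                                       # (8, 3) pathway weights
feats = torch.stack([m(x) for m in hub], dim=1)         # (8, 3, 32)
out = head((weights.unsqueeze(-1) * feats).sum(dim=1))  # weighted combination
loss = nn.functional.cross_entropy(out, torch.randint(0, 10, (8,)))
loss.backward()                                         # trains gate + head end-to-end
```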
arXiv Detail & Related papers (2022-06-08T08:00:12Z)
- Two-Level Residual Distillation based Triple Network for Incremental Object Detection [21.725878050355824]
We propose a novel incremental object detector based on Faster R-CNN to continuously learn from new object classes without using old data.
It is a triple network in which an old model and a residual model act as assistants, helping the incremental model learn new classes without forgetting previously learned knowledge.
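A minimal sketch of the distillation idea, assuming PyTorch and plain classifier heads in place of Faster R-CNN; the residual-model branch is omitted, and the loss weighting is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

old_model = nn.Linear(128, 10)      # frozen detector head, 10 old classes
for p in old_model.parameters():
    p.requires_grad = False

new_model = nn.Linear(128, 15)      # incremental head: 10 old + 5 new classes

x = torch.randn(8, 128)             # region features from new-class images
y_new = torch.randint(10, 15, (8,)) # labels for the new classes only

with torch.no_grad():
    old_logits = old_model(x)
new_logits = new_model(x)

# Distill old-class responses, supervise new classes with cross-entropy.
distill = F.mse_loss(new_logits[:, :10], old_logits)
ce = F.cross_entropy(new_logits, y_new)
loss = ce + distill
loss.backward()
```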
arXiv Detail & Related papers (2020-07-27T11:04:57Z)
- Active Learning for Sound Event Detection [18.750572243562576]
This paper proposes an active learning system for sound event detection (SED).
It aims to maximize the accuracy of a learned SED model with limited annotation effort.
Remarkably, the required annotation effort can be greatly reduced on the dataset where target sound events are rare.
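A minimal sketch of one query round, assuming PyTorch and entropy-based uncertainty sampling, which is a common selection criterion and not necessarily the paper's rule.

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 5)        # stand-in SED classifier, 5 event types
pool = torch.randn(100, 128)     # features of unlabeled audio clips

with torch.no_grad():
    probs = model(pool).softmax(dim=1)
    entropy = -(probs * probs.clamp_min(1e-9).log()).sum(dim=1)

# Query the K clips the model is least certain about for annotation.
K = 10
query_idx = entropy.topk(K).indices
print("clips to annotate:", query_idx.tolist())
```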
arXiv Detail & Related papers (2020-02-12T14:46:55Z)