Do sound event representations generalize to other audio tasks? A case
study in audio transfer learning
- URL: http://arxiv.org/abs/2106.11335v1
- Date: Mon, 21 Jun 2021 18:04:59 GMT
- Title: Do sound event representations generalize to other audio tasks? A case
study in audio transfer learning
- Authors: Anurag Kumar, Yun Wang, Vamsi Krishna Ithapu, Christian Fuegen
- Abstract summary: In this paper, we investigate the transfer learning capacity of audio representations obtained from neural networks trained on a large-scale sound event detection dataset.
We show that such simple linear transfer is already powerful enough to achieve high performance on the downstream tasks.
- Score: 20.572846660950812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning is critical for efficient information transfer across
multiple related learning problems. A simple, yet effective transfer learning
approach utilizes deep neural networks trained on a large-scale task for
feature extraction. Such representations are then used to learn related
downstream tasks. In this paper, we investigate the transfer learning capacity of
audio representations obtained from neural networks trained on a large-scale
sound event detection dataset. We build and evaluate these representations
across a wide range of other audio tasks, via a simple linear classifier
transfer mechanism. We show that such simple linear transfer is already
powerful enough to achieve high performance on the downstream tasks. We also
provide insights into the attributes of sound event representations that enable
such efficient information transfer.
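To make the linear transfer mechanism concrete, here is a minimal PyTorch sketch. The `pretrained` module is a hypothetical stand-in for a network trained on large-scale sound event detection (the paper's actual model and weights are not reproduced here); it is frozen, and only a linear classifier is trained on the downstream task. All sizes and data are illustrative.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a network pretrained on large-scale sound event
# detection; in practice you would load real pretrained weights.
pretrained = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=9, stride=4), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),   # -> 16-dim clip embedding
)
for p in pretrained.parameters():
    p.requires_grad = False    # freeze: features are extracted, not fine-tuned
pretrained.eval()

num_classes = 10                             # assumed downstream label count
linear_head = nn.Linear(16, num_classes)     # the only trainable module
optimizer = torch.optim.Adam(linear_head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy batch standing in for downstream audio clips and labels.
x = torch.randn(8, 1, 16000)                 # (batch, channel, samples)
y = torch.randint(0, num_classes, (8,))

with torch.no_grad():
    feats = pretrained(x)                    # frozen sound event representations
optimizer.zero_grad()
loss = loss_fn(linear_head(feats), y)
loss.backward()
optimizer.step()
```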
Related papers
- Leveraging Visual Supervision for Array-based Active Speaker Detection
and Localization [3.836171323110284]
We show that a simple audio convolutional recurrent neural network can perform simultaneous horizontal active speaker detection and localization.
We propose a new self-supervised training pipeline that embraces a "student-teacher" learning approach (a distillation sketch follows this entry).
arXiv Detail & Related papers (2023-12-21T16:53:04Z)
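As a rough illustration of the student-teacher idea in the entry above, the following hypothetical PyTorch sketch lets a frozen visual "teacher" produce soft targets that supervise an audio "student" via a distillation loss. The modules, feature sizes, and data are assumptions; the paper's actual pipeline differs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Linear(128, 2)        # stand-in: visual features -> speaker logits
student = nn.Linear(64, 2)         # stand-in: audio features  -> speaker logits
for p in teacher.parameters():
    p.requires_grad = False        # the teacher is fixed during student training

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
visual_feats = torch.randn(8, 128) # toy co-occurring visual features
audio_feats = torch.randn(8, 64)   # toy audio features for the same frames

with torch.no_grad():
    soft_targets = F.softmax(teacher(visual_feats), dim=-1)  # pseudo-labels
student_logp = F.log_softmax(student(audio_feats), dim=-1)
loss = F.kl_div(student_logp, soft_targets, reduction="batchmean")
loss.backward()
optimizer.step()
```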
- Evaluating the structure of cognitive tasks with transfer learning [67.22168759751541]
This study investigates the transferability of deep learning representations between different EEG decoding tasks.
We conduct extensive experiments using state-of-the-art decoding models on two recently released EEG datasets.
arXiv Detail & Related papers (2023-07-28T14:51:09Z)
- Learning General Audio Representations with Large-Scale Training of Patchout Audio Transformers [6.002503434201551]
We study the use of audio transformers trained on large-scale datasets to learn general-purpose representations (see the patchout sketch after this entry).
Our results show that representations extracted by audio transformers outperform CNN representations.
arXiv Detail & Related papers (2022-11-25T08:39:12Z)
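The patchout idea named in the entry above can be sketched as randomly dropping spectrogram patch tokens during training, which shortens the transformer's input sequence. The snippet below is a simplified assumption; the actual method also drops whole time/frequency stripes and differs in detail.

```python
import torch

def patchout(tokens: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    """Randomly keep a subset of patch tokens (batch, num_patches, dim)."""
    b, n, d = tokens.shape
    n_keep = max(1, int(n * keep_ratio))
    # Independent random permutation per batch element; keep the first n_keep.
    idx = torch.rand(b, n).argsort(dim=1)[:, :n_keep]
    return tokens.gather(1, idx.unsqueeze(-1).expand(b, n_keep, d))

tokens = torch.randn(4, 256, 192)  # toy spectrogram patch embeddings
print(patchout(tokens).shape)      # torch.Size([4, 128, 192])
```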
- An Exploration of Data Efficiency in Intra-Dataset Task Transfer for Dialog Understanding [65.75873687351553]
This study explores the effects of varying quantities of target task training data on sequential transfer learning in the dialog domain.
Counterintuitively, our data show that the size of the target task training set often has minimal effect on how sequential transfer learning performs compared to the same model without transfer learning.
arXiv Detail & Related papers (2022-10-21T04:36:46Z)
- Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation (see the masking sketch after this entry).
arXiv Detail & Related papers (2021-08-05T10:39:39Z)
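The spectrogram augmentation mentioned in the entry above is commonly realized as SpecAugment-style masking. The sketch below zeroes one random frequency band and one random time span of a log-mel spectrogram; the cited paper's exact augmentation policy may differ, and the mask widths are assumed values.

```python
import torch

def spec_augment(spec: torch.Tensor, max_f: int = 8, max_t: int = 20) -> torch.Tensor:
    """Apply one frequency mask and one time mask to a (freq, time) spectrogram."""
    spec = spec.clone()
    n_freq, n_time = spec.shape
    f = int(torch.randint(0, max_f + 1, (1,)))        # mask width in mel bins
    f0 = int(torch.randint(0, n_freq - f + 1, (1,)))  # mask start bin
    spec[f0:f0 + f, :] = 0.0
    t = int(torch.randint(0, max_t + 1, (1,)))        # mask width in frames
    t0 = int(torch.randint(0, n_time - t + 1, (1,)))
    spec[:, t0:t0 + t] = 0.0
    return spec

mel = torch.randn(64, 400)         # toy log-mel spectrogram (bins, frames)
augmented = spec_augment(mel)
```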
- Self-supervised Audiovisual Representation Learning for Remote Sensing Data [96.23611272637943]
We propose a self-supervised approach for pre-training deep neural networks in remote sensing.
This is done in a completely label-free manner by exploiting the correspondence between geo-tagged audio recordings and remote sensing imagery (a contrastive sketch follows this entry).
We show that our approach outperforms existing pre-training strategies for remote sensing imagery.
arXiv Detail & Related papers (2021-08-02T07:50:50Z)
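One common way to exploit such audio/imagery correspondence without labels is a symmetric contrastive (InfoNCE) objective over co-occurring pairs: matching audio and image embeddings are pulled together, mismatched ones pushed apart. The encoders, feature sizes, and temperature below are toy assumptions; the paper's actual objective may differ.

```python
import torch
import torch.nn.functional as F

audio_enc = torch.nn.Linear(128, 64)  # stand-in audio encoder
image_enc = torch.nn.Linear(256, 64)  # stand-in imagery encoder

audio = torch.randn(16, 128)          # toy audio features
image = torch.randn(16, 256)          # toy co-located imagery features

za = F.normalize(audio_enc(audio), dim=-1)
zi = F.normalize(image_enc(image), dim=-1)
logits = za @ zi.t() / 0.07           # pairwise similarities, assumed temperature
targets = torch.arange(16)            # i-th audio co-occurs with i-th image
loss = (F.cross_entropy(logits, targets)
        + F.cross_entropy(logits.t(), targets)) / 2
loss.backward()
```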
- What is being transferred in transfer learning? [51.6991244438545]
We show that when training from pre-trained weights, the model stays in the same basin in the loss landscape, and that different instances of such a model are similar in feature space and close in parameter space (an interpolation sketch follows this entry).
arXiv Detail & Related papers (2020-08-26T17:23:40Z)
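The "same basin" claim is typically probed by linearly interpolating between the weights of two trained models and checking that the loss stays low along the path (no barrier). A toy sketch with stand-in models and data:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def interpolated_loss(model_a, model_b, x, y, alpha: float) -> float:
    """Loss of the model whose weights are (1 - alpha) * A + alpha * B."""
    mixed = nn.Linear(10, 2)
    with torch.no_grad():
        for pm, pa, pb in zip(mixed.parameters(),
                              model_a.parameters(), model_b.parameters()):
            pm.copy_((1 - alpha) * pa + alpha * pb)
    return F.cross_entropy(mixed(x), y).item()

# Stand-ins for two models fine-tuned from the same pre-trained weights.
a, b = nn.Linear(10, 2), nn.Linear(10, 2)
x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(alpha, interpolated_loss(a, b, x, y, alpha))  # flat curve -> same basin
```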
- A Transfer Learning Method for Speech Emotion Recognition from Automatic Speech Recognition [0.0]
We present a transfer learning method for speech emotion recognition based on a Time-Delay Neural Network architecture.
Using five-fold cross-validation, we achieve significantly higher accuracy than the state of the art.
arXiv Detail & Related papers (2020-08-06T20:37:22Z)
- Anti-Transfer Learning for Task Invariance in Convolutional Neural Networks for Speech Processing [6.376852004129252]
We introduce the novel concept of anti-transfer learning for speech processing with convolutional neural networks.
We show that anti-transfer indeed leads to the intended task invariance and to more appropriate features for the target task at hand (a loss sketch follows this entry).
arXiv Detail & Related papers (2020-06-11T15:03:29Z)
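A minimal sketch of an anti-transfer-style penalty, under toy assumptions (modules, sizes, and the 0.1 weighting are illustrative): besides the task loss, the target network's activations are pushed to be dissimilar, in cosine similarity, to those of a frozen network pretrained on another task.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

pretrained = nn.Linear(40, 32)     # frozen stand-in trained on the other task
for p in pretrained.parameters():
    p.requires_grad = False
encoder = nn.Linear(40, 32)        # target-task feature extractor (trainable)
head = nn.Linear(32, 5)            # target-task classifier

x = torch.randn(8, 40)
y = torch.randint(0, 5, (8,))

feats = encoder(x)
task_loss = F.cross_entropy(head(feats), y)
with torch.no_grad():
    ref = pretrained(x)            # reference activations to move away from
anti = F.cosine_similarity(feats, ref, dim=-1).abs().mean()
loss = task_loss + 0.1 * anti      # assumed weighting of the anti-transfer term
loss.backward()
```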
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose a curriculum-based scheme that smooths the feature maps of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network progressively learns better representations of the data (a smoothing sketch follows this entry).
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
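A sketch of the smoothing curriculum under illustrative assumptions: feature maps are low-pass filtered with a depthwise Gaussian kernel whose sigma is annealed toward zero over training, so the network first sees heavily smoothed, low-information features.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel_1d(sigma: float, size: int = 5) -> torch.Tensor:
    xs = torch.arange(size, dtype=torch.float32) - size // 2
    k = torch.exp(-xs ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def smooth_feature_maps(fmap: torch.Tensor, sigma: float) -> torch.Tensor:
    """Low-pass filter feature maps of shape (batch, channels, height, width)."""
    k1 = gaussian_kernel_1d(sigma)
    k2 = torch.outer(k1, k1)                        # separable 2-D Gaussian
    c = fmap.shape[1]
    kernel = k2.expand(c, 1, 5, 5).contiguous()     # one filter per channel
    return F.conv2d(fmap, kernel, padding=2, groups=c)

fmap = torch.randn(2, 16, 32, 32)                   # toy CNN feature maps
for sigma in (2.0, 1.0, 0.5):                       # annealed during training
    blurred = smooth_feature_maps(fmap, sigma)
```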
- Inter- and Intra-domain Knowledge Transfer for Related Tasks in Deep Character Recognition [2.320417845168326]
Pre-training a deep neural network on the ImageNet dataset is a common practice for training deep learning models.
The technique of pre-training on one task and then retraining on a new one is called transfer learning.
In this paper we analyse the effectiveness of using deep transfer learning for character recognition tasks.
arXiv Detail & Related papers (2020-01-02T14:18:25Z)