Unsupervised Contrastive Learning of Sound Event Representations
- URL: http://arxiv.org/abs/2011.07616v1
- Date: Sun, 15 Nov 2020 19:50:14 GMT
- Title: Unsupervised Contrastive Learning of Sound Event Representations
- Authors: Eduardo Fonseca, Diego Ortego, Kevin McGuinness, Noel E. O'Connor,
Xavier Serra
- Abstract summary: Self-supervised representation learning can mitigate the limitations of recognition tasks with scarce manually labeled data but abundant unlabeled data.
In this work, we explore unsupervised contrastive learning as a way to learn sound event representations.
Our results suggest that unsupervised contrastive pre-training can mitigate the impact of data scarcity and increase robustness against noisy labels.
- Score: 30.914808451327403
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised representation learning can mitigate the limitations
of recognition tasks with scarce manually labeled data but abundant unlabeled
data, a common scenario in sound event research. In this work, we explore
unsupervised contrastive learning as a way to learn sound event
representations. To this end, we propose to use the pretext task of contrasting
differently augmented views of sound events. The views are computed primarily
via mixing of training examples with unrelated backgrounds, followed by other
data augmentations. We analyze the main components of our method via ablation
experiments. We evaluate the learned representations using linear evaluation
and on two in-domain downstream sound event classification tasks: learning with
limited manually labeled data, and learning with noisy labeled data. Our results
suggest that unsupervised contrastive pre-training can mitigate the impact of
data scarcity and increase robustness against noisy labels, outperforming
supervised baselines.
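As a rough illustration of the pretext task described in the abstract, the sketch below creates two views of each training patch by mixing it with unrelated background patches and contrasts the views with an NT-Xent-style loss. This is a minimal sketch assuming a SimCLR-style setup in PyTorch; the encoder, mixing coefficient, and patch shapes are placeholders, not the authors' exact configuration.

```python
import torch
import torch.nn.functional as F

def mix_with_background(x, background, alpha=0.25):
    """Create a view by linearly mixing a training patch with an
    unrelated background patch (hypothetical mixing scheme)."""
    lam = torch.empty(1).uniform_(0, alpha).item()
    return (1 - lam) * x + lam * background

def nt_xent(z1, z2, temperature=0.1):
    """NT-Xent contrastive loss over a batch of paired views."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D)
    sim = z @ z.t() / temperature                        # cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool)
    sim = sim.masked_fill(mask, float('-inf'))           # drop self-similarity
    # positives sit at offset n: view i pairs with view i + n
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Usage: encoder and shapes are stand-ins for a real audio network.
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(96 * 64, 128))
x = torch.randn(8, 96, 64)    # batch of log-mel patches (hypothetical shape)
bg = torch.randn(8, 96, 64)   # unrelated background patches
v1 = mix_with_background(x, bg[torch.randperm(8)])
v2 = mix_with_background(x, bg[torch.randperm(8)])
loss = nt_xent(encoder(v1), encoder(v2))
loss.backward()
```

In the actual method, further data augmentations would be applied to each view after the background mixing; they are omitted here for brevity.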
Related papers
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a lightweight black-box tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
- Pretraining Representations for Bioacoustic Few-shot Detection using Supervised Contrastive Learning [10.395255631261458]
In bioacoustic applications, most tasks come with little labelled training data, because annotating long recordings is time-consuming and costly.
We show that learning a rich feature extractor from scratch can be achieved by leveraging data augmentation using a supervised contrastive learning framework.
We obtain an F-score of 63.46% on the validation set and 42.7% on the test set, ranking second in the DCASE challenge.
arXiv Detail & Related papers (2023-09-02T09:38:55Z)
- Representation Learning for the Automatic Indexing of Sound Effects Libraries [79.68916470119743]
We show that a task-specific but dataset-independent representation can successfully address data issues such as class imbalance, inconsistent class labels, and insufficient dataset size.
Detailed experimental results show the impact of metric learning approaches and different cross-dataset training methods on representational effectiveness.
arXiv Detail & Related papers (2022-08-18T23:46:13Z)
- Context-based Virtual Adversarial Training for Text Classification with Noisy Labels [1.9508698179748525]
We propose context-based virtual adversarial training (ConVAT) to prevent a text classifier from overfitting to noisy labels.
Unlike previous works, the proposed method performs adversarial training at the context level rather than on the inputs.
We conduct extensive experiments on four text classification datasets with two types of label noises.
arXiv Detail & Related papers (2022-05-29T14:19:49Z)
- Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing [52.2231419645482]
This paper focuses on the weakly-supervised audio-visual video parsing task.
It aims to recognize all events belonging to each modality and localize their temporal boundaries.
arXiv Detail & Related papers (2022-04-25T11:41:17Z)
- Augmented Contrastive Self-Supervised Learning for Audio Invariant Representations [28.511060004984895]
We propose an augmented contrastive SSL framework to learn invariant representations from unlabeled data.
Our method applies various perturbations to the unlabeled input data and utilizes contrastive learning to learn representations robust to such perturbations.
arXiv Detail & Related papers (2021-12-21T02:50:53Z)
- Combating Noise: Semi-supervised Learning by Region Uncertainty Quantification [55.23467274564417]
Current methods are easily distracted by noisy regions generated by pseudo labels.
We propose noise-resistant semi-supervised learning by quantifying the region uncertainty.
Experiments on both PASCAL VOC and MS COCO demonstrate the extraordinary performance of our method.
arXiv Detail & Related papers (2021-11-01T13:23:42Z)
- Cross-Referencing Self-Training Network for Sound Event Detection in Audio Mixtures [23.568610919253352]
This paper proposes a semi-supervised method for generating pseudo-labels from unlabeled data using a student-teacher scheme that balances self-training and cross-training (a generic sketch of such a student-teacher scheme is given after this list).
Results on both the "validation" and "public evaluation" sets of the DESED database show significant improvement over state-of-the-art semi-supervised systems.
arXiv Detail & Related papers (2021-05-27T18:46:59Z)
- Learning from Noisy Similar and Dissimilar Data [84.76686918337134]
We show how to learn a classifier from noisy similar (S) and dissimilar (D) labeled data.
We also show important connections between learning from such pairwise supervision data and learning from ordinary class-labeled data.
arXiv Detail & Related papers (2020-02-03T19:59:16Z)
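Several of the entries above rely on a student-teacher scheme to generate pseudo-labels from unlabeled data. The sketch below shows the generic pattern (a mean-teacher-style EMA teacher plus a consistency loss on unlabeled examples); it is a hedged illustration of the general idea, not the exact method of any paper listed here, and the model, data, and coefficients are placeholders.

```python
import copy
import torch
import torch.nn.functional as F

def ema_update(teacher, student, decay=0.999):
    """Exponential moving average of student weights into the teacher."""
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(decay).add_(s, alpha=1 - decay)

student = torch.nn.Linear(64, 10)        # stand-in classifier
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)              # teacher is updated only via EMA
opt = torch.optim.SGD(student.parameters(), lr=0.1)

x_lab, y_lab = torch.randn(16, 64), torch.randint(0, 10, (16,))
x_unlab = torch.randn(32, 64)            # abundant unlabeled examples

for step in range(10):
    # supervised loss on the few labeled examples
    sup = F.cross_entropy(student(x_lab), y_lab)
    # consistency loss: student matches the teacher's soft pseudo-labels
    with torch.no_grad():
        pseudo = teacher(x_unlab).softmax(dim=1)
    cons = F.kl_div(student(x_unlab).log_softmax(dim=1), pseudo,
                    reduction='batchmean')
    (sup + cons).backward()
    opt.step(); opt.zero_grad()
    ema_update(teacher, student)
```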