Data is Overrated: Perceptual Metrics Can Lead Learning in the Absence of Training Data
- URL: http://arxiv.org/abs/2312.03455v1
- Date: Wed, 6 Dec 2023 12:27:25 GMT
- Title: Data is Overrated: Perceptual Metrics Can Lead Learning in the Absence of Training Data
- Authors: Tashi Namgyal, Alexander Hepburn, Raul Santos-Rodriguez, Valero Laparra, Jesus Malo
- Abstract summary: Perceptual metrics are traditionally used to evaluate the quality of natural signals, such as images and audio.
We show that training with perceptual losses improves the reconstruction of spectrograms and re-synthesized audio at test time over models trained with a standard Euclidean loss.
- Score: 44.659718609385315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Perceptual metrics are traditionally used to evaluate the quality of natural
signals, such as images and audio. They are designed to mimic the perceptual
behaviour of human observers and usually reflect structures found in natural
signals. This motivates their use as loss functions for training generative
models such that models will learn to capture the structure held in the metric.
We take this idea to the extreme in the audio domain by training a compressive
autoencoder to reconstruct uniform noise, in lieu of natural data. We show that
training with perceptual losses improves the reconstruction of spectrograms and
re-synthesized audio at test time over models trained with a standard Euclidean
loss. This demonstrates better generalisation to unseen natural signals when
using perceptual metrics.
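The abstract states the setup compactly; the sketch below is a minimal illustration of that setup, not the authors' code. It trains a small compressive autoencoder on uniform noise twice, once with a plain Euclidean (MSE) loss and once with a multi-scale loss that stands in for a perceptual metric. The architecture, the stand-in loss, and all hyperparameters here are assumptions for illustration; the paper's actual perceptual metric is not reproduced.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CompressiveAE(nn.Module):
    """Small convolutional autoencoder over (freq, time) 'spectrogram' patches."""
    def __init__(self, bottleneck=16):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, bottleneck, 4, stride=2, padding=1),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(bottleneck, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

def multiscale_loss(x_hat, x, scales=(1, 2, 4)):
    """Hypothetical stand-in for a perceptual metric: compare reconstructions
    at several resolutions so coarse structure is weighted, not just pixels."""
    loss = 0.0
    for s in scales:
        a = F.avg_pool2d(x_hat, s) if s > 1 else x_hat
        b = F.avg_pool2d(x, s) if s > 1 else x
        loss = loss + F.mse_loss(a, b)
    return loss / len(scales)

def train(loss_fn, steps=200, batch=8, size=64, seed=0):
    torch.manual_seed(seed)
    model = CompressiveAE()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        # Key point from the abstract: the training "data" is uniform noise.
        x = torch.rand(batch, 1, size, size)
        opt.zero_grad()
        loss = loss_fn(model(x), x)
        loss.backward()
        opt.step()
    return model

mse_model = train(F.mse_loss)          # Euclidean baseline
perc_model = train(multiscale_loss)    # "perceptual"-style training signal
```

At test time, both trained models would be evaluated on natural spectrograms and the re-synthesized audio, which is where the abstract reports better generalisation for the perceptually trained model.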
Related papers
- The Effect of Perceptual Metrics on Music Representation Learning for Genre Classification [42.14708549155406]
We show that models trained with perceptual metrics as loss functions can capture perceptually meaningful features.
We demonstrate that using features extracted from autoencoders trained with perceptual losses can improve performance on music understanding tasks.
arXiv Detail & Related papers (2024-09-25T16:29:21Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
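The two NMTune entries above describe the same light-weight tuning idea. As a rough illustration only, the sketch below assumes that transforming the feature space means learning an affine map on top of frozen, possibly noisily pre-trained features, trained jointly with a downstream classifier; NMTune's actual objectives and regularizers are not reproduced here.

```python
import torch
import torch.nn as nn

class AffineTune(nn.Module):
    """Affine map plus classifier head on top of frozen pre-trained features
    (black-box: the foundation model itself is never updated)."""
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.affine = nn.Linear(feat_dim, feat_dim)   # learned affine map W x + b
        self.head = nn.Linear(feat_dim, num_classes)  # downstream classifier

    def forward(self, features):
        return self.head(self.affine(features))

# Usage with stand-in tensors in place of a frozen encoder's outputs.
feat_dim, num_classes = 512, 10
model = AffineTune(feat_dim, num_classes)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
features = torch.randn(32, feat_dim)            # placeholder encoder features
labels = torch.randint(0, num_classes, (32,))
loss = nn.functional.cross_entropy(model(features), labels)
loss.backward()
opt.step()
```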
- Self-Supervised Learning for Audio-Based Emotion Recognition [1.7598252755538808]
Self-supervised learning is a family of methods which can learn despite a scarcity of supervised labels.
We have applied self-supervised learning pre-training to the classification of emotions from the CMU-MOSEI acoustic modality.
We find that self-supervised learning consistently improves the performance of the model across all metrics.
arXiv Detail & Related papers (2023-07-23T14:40:50Z)
- Dynamic Scheduled Sampling with Imitation Loss for Neural Text Generation [10.306522595622651]
We introduce Dynamic Scheduled Sampling with Imitation Loss (DySI), which maintains the sampling schedule based solely on training-time accuracy.
DySI achieves notable improvements on standard machine translation benchmarks, and significantly improves the robustness of other text generation models.
arXiv Detail & Related papers (2023-01-31T16:41:06Z)
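A rough illustration of the scheduling idea in the DySI entry above: the probability of feeding the model its own predictions (rather than gold tokens) is tied to a running estimate of training-time accuracy. The exact accuracy-to-probability mapping and the imitation loss are assumptions here, not the paper's formulation.

```python
import random

class AccuracyDrivenSchedule:
    """Probability of feeding the model its own prediction (instead of the
    gold token) grows as running training accuracy improves."""

    def __init__(self, momentum=0.99):
        self.momentum = momentum
        self.running_acc = 0.0

    def update(self, batch_accuracy):
        # Exponential moving average of training-time token accuracy.
        self.running_acc = (self.momentum * self.running_acc
                            + (1 - self.momentum) * batch_accuracy)

    def use_model_token(self):
        # Low accuracy early on -> mostly teacher forcing;
        # high accuracy later -> mostly the model's own predictions.
        return random.random() < self.running_acc

# Usage inside a decoding loop (schematic):
# schedule = AccuracyDrivenSchedule()
# next_input = pred_tok if schedule.use_model_token() else gold_tok
# schedule.update(batch_accuracy)   # after each training batch
```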
- Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
- Applications of Koopman Mode Analysis to Neural Networks [52.77024349608834]
We consider the training process of a neural network as a dynamical system acting on the high-dimensional weight space.
We show how the Koopman spectrum can be used to determine the number of layers required for the architecture.
We also show how Koopman modes can be used to selectively prune the network to speed up the training procedure.
arXiv Detail & Related papers (2020-06-21T11:00:04Z)
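The Koopman entry above treats training as a dynamical system on weight space. A generic way to estimate a Koopman-style spectrum from weight snapshots is dynamic mode decomposition (DMD); the sketch below is that generic tool under the assumption of evenly spaced snapshots, not the paper's exact procedure.

```python
import numpy as np

def dmd_spectrum(snapshots, rank=10):
    """snapshots: array of shape (d, T) -- flattened weights at T training steps.
    Returns approximate Koopman eigenvalues and modes via rank-truncated DMD."""
    X, Y = snapshots[:, :-1], snapshots[:, 1:]          # pairs (w_t, w_{t+1})
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    r = min(rank, len(s))
    U, s, Vh = U[:, :r], s[:r], Vh[:r]
    # Low-rank estimate of the linear operator A with Y ~= A X.
    A_tilde = U.T @ Y @ Vh.T @ np.diag(1.0 / s)
    eigvals, W = np.linalg.eig(A_tilde)
    modes = Y @ Vh.T @ np.diag(1.0 / s) @ W             # DMD modes in weight space
    return eigvals, modes

# Usage: stack flattened weight vectors recorded every k optimizer steps.
# Eigenvalues near 1 indicate slowly evolving directions; fast-decaying modes
# are candidates for pruning-style analyses as in the entry above.
```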
This list is automatically generated from the titles and abstracts of the papers on this site.