SCP-GAN: Self-Correcting Discriminator Optimization for Training
Consistency Preserving Metric GAN on Speech Enhancement Tasks
- URL: http://arxiv.org/abs/2210.14474v1
- Date: Wed, 26 Oct 2022 04:48:40 GMT
- Title: SCP-GAN: Self-Correcting Discriminator Optimization for Training
Consistency Preserving Metric GAN on Speech Enhancement Tasks
- Authors: Vasily Zadorozhnyy and Qiang Ye and Kazuhito Koishida
- Abstract summary: We introduce several improvements to the GAN training schemes, which can be applied to most GAN-based SE models.
We present self-correcting optimization for training a GAN discriminator on SE tasks, which helps avoid "harmful" training directions.
We have tested our proposed methods on several state-of-the-art GAN-based SE models and obtained consistent improvements.
- Score: 28.261911789087463
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, Generative Adversarial Networks (GANs) have produced
significantly improved results in speech enhancement (SE) tasks. They are
difficult to train, however. In this work, we introduce several improvements to
the GAN training schemes, which can be applied to most GAN-based SE models. We
propose using consistency loss functions, which target the inconsistency in
time and time-frequency domains caused by Fourier and Inverse Fourier
Transforms. We also present self-correcting optimization for training a GAN
discriminator on SE tasks, which helps avoid "harmful" training directions for
parts of the discriminator loss function. We have tested our proposed methods
on several state-of-the-art GAN-based SE models and obtained consistent
improvements, including new state-of-the-art results for the Voice Bank+DEMAND
dataset.
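The consistency idea in the abstract can be made concrete with a small sketch. The snippet below is one illustrative reading, not the paper's exact loss: it assumes PyTorch and hypothetical STFT settings (N_FFT, HOP, the Hann window, and the stft_consistency_loss helper are all invented for the example). The enhanced complex spectrogram is sent back to the time domain with an inverse STFT and re-analysed with a forward STFT; the penalty is the mismatch introduced by that round trip, i.e. the time / time-frequency inconsistency the abstract refers to.

```python
# Minimal sketch of an STFT-consistency penalty (assumed formulation, not the
# authors' exact loss). A spectrogram produced by a network need not correspond
# to any real waveform; the iSTFT->STFT round trip projects it onto the set of
# consistent spectrograms, and the difference measures the inconsistency.
import torch

N_FFT, HOP = 512, 256                 # hypothetical analysis settings
WINDOW = torch.hann_window(N_FFT)

def stft_consistency_loss(enhanced_spec: torch.Tensor, num_samples: int) -> torch.Tensor:
    """enhanced_spec: complex tensor (batch, n_fft//2 + 1, frames) produced by
    the enhancement model; num_samples: length of the waveform it was analysed from."""
    # Resynthesise the waveform implied by the (possibly inconsistent) spectrogram.
    wav = torch.istft(enhanced_spec, n_fft=N_FFT, hop_length=HOP,
                      window=WINDOW, length=num_samples)
    # Re-analyse it; a consistent spectrogram reproduces itself under this round trip.
    reanalysed = torch.stft(wav, n_fft=N_FFT, hop_length=HOP,
                            window=WINDOW, return_complex=True)
    # Penalise the round-trip mismatch (here a simple L1 penalty on the complex difference).
    return (enhanced_spec - reanalysed).abs().mean()
```

In a GAN-based SE model, a term of this kind would typically be added to the generator's loss alongside the adversarial and metric objectives.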
Related papers
- Private GANs, Revisited [16.570354461039603]
We show that the canonical approach for training differentially private GANs can yield significantly improved results after modifications to training.
We show that a simple fix -- taking more discriminator steps between generator steps -- restores parity between the generator and discriminator and improves results.
arXiv Detail & Related papers (2023-02-06T17:11:09Z)
- Improving GANs with A Dynamic Discriminator [106.54552336711997]
We argue that a discriminator whose capacity is adjusted on the fly can better accommodate the time-varying task it faces as the generator evolves.
A comprehensive empirical study confirms that the proposed training strategy, termed DynamicD, improves synthesis performance without incurring any additional cost or training objectives.
arXiv Detail & Related papers (2022-09-20T17:57:33Z)
- Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain [18.090665052145653]
We propose a novel training task for speech enhancement using a complex-valued deep neural network.
We derive this training task within the formalism of differential equations, thereby enabling the use of predictor-corrector samplers.
arXiv Detail & Related papers (2022-03-31T12:53:47Z)
- Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework built on a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z)
- Time-domain Speech Enhancement with Generative Adversarial Learning [53.74228907273269]
This paper proposes a new framework called Time-domain Speech Enhancement Generative Adversarial Network (TSEGAN).
TSEGAN extends the generative adversarial network (GAN) to the time domain with metric evaluation to mitigate the scaling problem.
In addition, we provide a new method based on objective function mapping for the theoretical analysis of the performance of Metric GAN.
arXiv Detail & Related papers (2021-03-30T08:09:49Z)
- Training GANs with Stronger Augmentations via Contrastive Discriminator [80.8216679195]
We introduce a contrastive representation learning scheme into the GAN discriminator, coined ContraD.
This "fusion" enables the discriminators to work with much stronger augmentations without increasing their training instability.
Our experimental results show that GANs with ContraD consistently improve FID and IS compared to other recent techniques incorporating data augmentations.
arXiv Detail & Related papers (2021-03-17T16:04:54Z)
- Improving GAN Training with Probability Ratio Clipping and Sample Reweighting [145.5106274085799]
Generative adversarial networks (GANs) often suffer from inferior performance due to unstable training.
We propose a new variational GAN training framework which enjoys superior training stability.
By plugging the training approach in diverse state-of-the-art GAN architectures, we obtain significantly improved performance over a range of tasks.
arXiv Detail & Related papers (2020-06-12T01:39:48Z)
- Stabilizing Training of Generative Adversarial Nets via Langevin Stein Variational Gradient Descent [11.329376606876101]
We propose to stabilize GAN training via a novel particle-based variational inference method, Langevin Stein variational gradient descent (LSVGD).
We show that the LSVGD dynamics has an implicit regularization effect that enhances particle spread and diversity.
arXiv Detail & Related papers (2020-04-22T11:20:04Z)
- Feature Quantization Improves GAN Training [126.02828112121874]
Feature Quantization (FQ) for the discriminator embeds both true and fake data samples into a shared discrete space.
Our method can be easily plugged into existing GAN models, with little computational overhead in training.
arXiv Detail & Related papers (2020-04-05T04:06:50Z)