Non-saturating GAN training as divergence minimization
- URL: http://arxiv.org/abs/2010.08029v1
- Date: Thu, 15 Oct 2020 21:34:56 GMT
- Title: Non-saturating GAN training as divergence minimization
- Authors: Matt Shannon, Ben Poole, Soroosh Mariooryad, Tom Bagby, Eric
Battenberg, David Kao, Daisy Stanton, RJ Skerry-Ryan
- Abstract summary: We show that non-saturating generative adversarial network (GAN) training does in fact approximately minimize a particular f-divergence.
These results help to explain the high sample quality but poor diversity often observed empirically when using this scheme.
- Score: 22.459183156517728
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Non-saturating generative adversarial network (GAN) training is widely used
and has continued to obtain groundbreaking results. However, so far this
approach has lacked strong theoretical justification, in contrast to
alternatives such as f-GANs and Wasserstein GANs which are motivated in terms
of approximate divergence minimization. In this paper we show that
non-saturating GAN training does in fact approximately minimize a particular
f-divergence. We develop general theoretical tools to compare and classify
f-divergences and use these to show that the new f-divergence is qualitatively
similar to reverse KL. These results help to explain the high sample quality
but poor diversity often observed empirically when using this scheme.
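The distinction the abstract draws can be made concrete. In the original minimax formulation, the generator minimizes log(1 - D(G(z))), which saturates (has near-zero gradient) when the discriminator confidently rejects a sample; the non-saturating variant instead minimizes -log(D(G(z))), which keeps a strong gradient in exactly that regime. The sketch below is an illustrative comparison of the two losses, not code from the paper:

```python
import math

def saturating_gen_loss(d_fake):
    # Original minimax generator loss: minimize log(1 - D(G(z))).
    # Nearly flat when D confidently rejects the sample (d_fake near 0).
    return math.log(1.0 - d_fake)

def non_saturating_gen_loss(d_fake):
    # Non-saturating alternative: minimize -log(D(G(z))).
    # Steep when D confidently rejects the sample, so training keeps moving.
    return -math.log(d_fake)

# Compare finite-difference slopes at a confidently-rejected sample, D(G(z)) = 0.01.
d = 0.01
eps = 1e-6
sat_grad = (saturating_gen_loss(d + eps) - saturating_gen_loss(d)) / eps
ns_grad = (non_saturating_gen_loss(d + eps) - non_saturating_gen_loss(d)) / eps

print(abs(sat_grad) < abs(ns_grad))  # prints True
```

Here the saturating loss has slope roughly -1/(1 - d) ≈ -1, while the non-saturating loss has slope roughly -1/d ≈ -100, which is why the heuristic was adopted in practice; the paper's contribution is showing that this heuristic still approximately minimizes a specific (reverse-KL-like) f-divergence.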
Related papers
- Theory-Informed Improvements to Classifier-Free Guidance for Discrete Diffusion Models [24.186262549509102]
This paper theoretically analyzes CFG in the context of masked discrete diffusion. High guidance early in sampling (when inputs are heavily masked) harms generation quality, while late-stage guidance has a larger effect. Our method smoothens the transport between the data distribution and the initial (masked/uniform) distribution, which results in improved sample quality.
arXiv Detail & Related papers (2025-07-11T18:48:29Z)
- Gradient Extrapolation for Debiased Representation Learning [7.183424522250937]
Gradient Extrapolation for Debiased Representation Learning (GERNE) is designed to learn debiased representations in both known and unknown attribute training cases.
GERNE can serve as a general framework for debiasing with methods, such as ERM, reweighting, and resampling, being shown as special cases.
The proposed approach is validated on five vision and one NLP benchmarks, demonstrating competitive and often superior performance compared to state-of-the-art baseline methods.
arXiv Detail & Related papers (2025-03-17T14:48:57Z)
- Nested Annealed Training Scheme for Generative Adversarial Networks [54.70743279423088]
This paper focuses on a rigorous mathematical theoretical framework: the composite-functional-gradient GAN (CFG).
We reveal the theoretical connection between the CFG model and score-based models.
We find that the training objective of the CFG discriminator is equivalent to finding an optimal D(x).
arXiv Detail & Related papers (2025-01-20T07:44:09Z)
- A New Formulation of Lipschitz Constrained With Functional Gradient Learning for GANs [52.55025869932486]
This paper introduces a promising alternative method for training Generative Adversarial Networks (GANs) on large-scale datasets with clear theoretical guarantees.
We propose a novel Lipschitz-constrained Functional Gradient GANs learning (Li-CFG) method to stabilize the training of GAN.
We demonstrate that the neighborhood size of the latent vector can be reduced by increasing the norm of the discriminator gradient.
arXiv Detail & Related papers (2025-01-20T02:48:07Z)
- Contrastive CFG: Improving CFG in Diffusion Models by Contrasting Positive and Negative Concepts [55.298031232672734]
Classifier-Free Guidance (CFG) has proven effective in conditional diffusion model sampling for improved condition alignment.
We present a novel method to enhance negative CFG guidance using contrastive loss.
arXiv Detail & Related papers (2024-11-26T03:29:27Z)
- Supervised Contrastive Learning with Heterogeneous Similarity for Distribution Shifts [3.7819322027528113]
We propose a new regularization using the supervised contrastive learning to prevent such overfitting and to train models that do not degrade their performance under the distribution shifts.
Experiments on benchmark datasets that emulate distribution shifts, including subpopulation shift and domain generalization, demonstrate the advantage of the proposed method.
arXiv Detail & Related papers (2023-04-07T01:45:09Z)
- MonoFlow: Rethinking Divergence GANs via the Perspective of Wasserstein Gradient Flows [34.795115757545915]
We introduce a unified generative modeling framework - MonoFlow.
Under our framework, adversarial training can be viewed as a procedure that first obtains MonoFlow's vector field.
We also reveal the fundamental difference between variational divergence minimization and adversarial training.
arXiv Detail & Related papers (2023-02-02T13:05:27Z)
- How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
- UQGAN: A Unified Model for Uncertainty Quantification of Deep Classifiers trained via Conditional GANs [9.496524884855559]
We present an approach to quantifying uncertainty for deep neural networks in image classification, based on generative adversarial networks (GANs).
Instead of shielding the entire in-distribution data with GAN generated OoD examples, we shield each class separately with out-of-class examples generated by a conditional GAN.
In particular, we improve over the OoD detection and FP detection performance of state-of-the-art GAN-training based classifiers.
arXiv Detail & Related papers (2022-01-31T14:42:35Z)
- Understanding Why Generalized Reweighting Does Not Improve Over ERM [36.69039005731499]
Empirical risk minimization (ERM) is known in practice to be non-robust to distributional shift where the training and the test distributions are different.
A suite of approaches, such as importance weighting, and variants of distributionally robust optimization (DRO) have been proposed to solve this problem.
But a line of recent work has empirically shown that these approaches do not significantly improve over ERM in real applications with distribution shift.
arXiv Detail & Related papers (2022-01-28T17:58:38Z)
- GANs with Variational Entropy Regularizers: Applications in Mitigating the Mode-Collapse Issue [95.23775347605923]
Building on the success of deep learning, Generative Adversarial Networks (GANs) provide a modern approach to learn a probability distribution from observed samples.
GANs often suffer from the mode collapse issue where the generator fails to capture all existing modes of the input distribution.
We take an information-theoretic approach and maximize a variational lower bound on the entropy of the generated samples to increase their diversity.
arXiv Detail & Related papers (2020-09-24T19:34:37Z)
- Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels [92.98756432746482]
We study a weakly supervised problem called learning with complementary labels.
We show that the quality of gradient estimation matters more in risk minimization.
We propose a novel surrogate complementary loss (SCL) framework that trades zero bias with reduced variance.
arXiv Detail & Related papers (2020-07-05T04:19:37Z)
- When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability is still a lingering concern of generative adversarial networks (GANs).
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss which performs better generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss can provide significant improvement on various vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)
- Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks [65.24701908364383]
We show that a sufficient condition for calibrated uncertainty on a ReLU network is to be "a bit Bayesian".
We further validate these findings empirically via various standard experiments using common deep ReLU networks and Laplace approximations.
arXiv Detail & Related papers (2020-02-24T08:52:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.