Style Quantization for Data-Efficient GAN Training
- URL: http://arxiv.org/abs/2503.24282v1
- Date: Mon, 31 Mar 2025 16:28:44 GMT
- Title: Style Quantization for Data-Efficient GAN Training
- Authors: Jian Wang, Xin Lan, Jizhe Zhou, Yuxin Tian, Jiancheng Lv
- Abstract summary: Under a limited data setting, GANs often struggle to navigate and effectively exploit the input latent space. We propose SQ-GAN, a novel approach that enhances consistency regularization. Experiments demonstrate significant improvements in both discriminator robustness and generation quality.
- Score: 18.40243591024141
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Under a limited data setting, GANs often struggle to navigate and effectively exploit the input latent space. Consequently, images generated from adjacent variables in a sparse input latent space may exhibit significant discrepancies in realism, leading to suboptimal consistency regularization (CR) outcomes. To address this, we propose SQ-GAN, a novel approach that enhances CR by introducing a style space quantization scheme. This method transforms the sparse, continuous input latent space into a compact, structured discrete proxy space, allowing each element to correspond to a specific real data point, thereby improving CR performance. Instead of direct quantization, we first map the input latent variables into a less entangled "style" space and apply quantization using a learnable codebook. This enables each quantized code to control distinct factors of variation. Additionally, we minimize the optimal transport distance to align the codebook codes with features extracted from the training data by a foundation model, embedding external knowledge into the codebook and establishing a semantically rich vocabulary that properly describes the training dataset. Extensive experiments demonstrate significant improvements in both discriminator robustness and generation quality with our method.
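To make the two ingredients concrete, here is a minimal sketch assuming a PyTorch setting: a mapping network sends the input latent to a style space, a learnable codebook quantizes styles with straight-through gradients, and a Sinkhorn routine gives an entropy-regularized optimal transport cost for aligning the codebook with externally extracted features. All names, dimensions, and the cosine cost are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StyleQuantizer(nn.Module):
    """Map an input latent z to a style vector, then snap it to the
    nearest entry of a learnable codebook (straight-through gradients)."""

    def __init__(self, z_dim=128, style_dim=64, num_codes=512):
        super().__init__()
        # Mapping network: input latent -> less entangled style space.
        self.mapper = nn.Sequential(
            nn.Linear(z_dim, style_dim),
            nn.LeakyReLU(0.2),
            nn.Linear(style_dim, style_dim),
        )
        # Learnable codebook; ideally each code controls one factor of variation.
        self.codebook = nn.Embedding(num_codes, style_dim)

    def forward(self, z):
        w = self.mapper(z)                            # (B, D) continuous styles
        dists = torch.cdist(w, self.codebook.weight)  # (B, K) pairwise distances
        idx = dists.argmin(dim=1)                     # nearest code per sample
        w_q = self.codebook(idx)                      # (B, D) quantized styles
        # Straight-through estimator: the forward pass uses w_q, the backward
        # pass treats quantization as identity so the mapper gets gradients.
        return w + (w_q - w).detach(), idx


def sinkhorn_ot_cost(codes, feats, eps=0.1, n_iters=50):
    """Entropy-regularized OT cost between codebook entries and a batch of
    externally extracted features, computed with Sinkhorn iterations."""
    codes = F.normalize(codes, dim=1)
    feats = F.normalize(feats, dim=1)
    cost = 1.0 - codes @ feats.t()                    # (K, N) cosine cost
    K, N = cost.shape
    a = cost.new_full((K,), 1.0 / K)                  # uniform marginal over codes
    b = cost.new_full((N,), 1.0 / N)                  # uniform marginal over feats
    Kmat = torch.exp(-cost / eps)                     # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(n_iters):                          # alternating projections
        v = b / (Kmat.t() @ u)
        u = a / (Kmat @ v)
    plan = u.unsqueeze(1) * Kmat * v.unsqueeze(0)     # transport plan
    return (plan * cost).sum()                        # cost to minimize


# Usage: quantized styles would feed the generator; the OT cost aligns the
# codebook with stand-in "foundation model" features (random, for illustration).
quantizer = StyleQuantizer()
w_q, idx = quantizer(torch.randn(16, 128))
align_cost = sinkhorn_ot_cost(quantizer.codebook.weight, torch.randn(256, 64))
align_cost.backward()
```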
Related papers
- Controlled LLM Decoding via Discrete Auto-regressive Biasing [9.843359827321194]
Controlled text generation allows for enforcing user-defined constraints on large language model outputs.
We propose Discrete Auto-regressive Biasing, a controlled decoding algorithm that leverages gradients while operating entirely in the discrete text domain.
Our method significantly improves constraint satisfaction while maintaining comparable or better fluency, all with even lower computational costs.
arXiv Detail & Related papers (2025-02-06T00:14:43Z)
- Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model [66.91323540178739]
Sequential recommendation (SR) aims to predict items that users may be interested in based on their historical behavior.
We revisit SR from a novel information-theoretic perspective and find that sequential modeling methods fail to adequately capture the randomness and unpredictability of user behavior.
Inspired by fuzzy information processing theory, this paper introduces the fuzzy sets of interaction sequences to overcome the limitations and better capture the evolution of users' real interests.
arXiv Detail & Related papers (2024-10-31T14:52:01Z)
- Protect Before Generate: Error Correcting Codes within Discrete Deep Generative Models [3.053842954605396]
We introduce a novel method that enhances variational inference in discrete latent variable models.
We leverage Error Correcting Codes (ECCs) to introduce redundancy in the latent representations.
This redundancy is then exploited by the variational posterior to yield more accurate estimates; a toy sketch of the redundancy idea follows.
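The sketch below is a toy illustration of redundancy plus error correction only (a repetition code with majority-vote decoding), not the paper's construction; all names and the noise model are assumptions.

```python
import torch


def encode_repetition(bits, r=3):
    """Add redundancy: repeat each latent bit r times, (B, L) -> (B, L * r)."""
    return bits.repeat_interleave(r, dim=1)


def decode_majority(received, r=3):
    """Exploit redundancy: majority vote over each group of r received bits."""
    groups = received.view(received.shape[0], -1, r)
    return (groups.float().mean(dim=2) > 0.5).long()


bits = torch.randint(0, 2, (4, 8))                 # discrete latent code
codeword = encode_repetition(bits)                 # redundant representation
flips = (torch.rand(codeword.shape) < 0.1).long()  # sparse bit-flip noise
decoded = decode_majority(codeword ^ flips)        # corrects isolated flips
print((decoded == bits).float().mean())            # fraction recovered
```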
arXiv Detail & Related papers (2024-10-10T11:59:58Z)
- Structured Probabilistic Coding [28.46046583495838]
This paper presents a new supervised representation learning framework, namely structured probabilistic coding (SPC).
SPC is an encoder-only probabilistic coding technology with a structured regularization from the target space.
It can enhance the generalization ability of pre-trained language models for better language understanding.
arXiv Detail & Related papers (2023-12-21T15:28:02Z)
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
- Dataset Condensation with Latent Space Knowledge Factorization and Sharing [73.31614936678571]
We introduce a novel approach for solving the dataset condensation problem by exploiting the regularity in a given dataset.
Instead of condensing the dataset directly in the original input space, we assume a generative process of the dataset with a set of learnable codes.
We experimentally show that our method achieves new state-of-the-art records by significant margins on various benchmark datasets.
arXiv Detail & Related papers (2022-08-21T18:14:08Z)
- Combining Latent Space and Structured Kernels for Bayesian Optimization over Combinatorial Spaces [27.989924313988016]
We consider the problem of optimizing combinatorial spaces (e.g., sequences, trees, and graphs) using expensive black-box function evaluations.
A recent Bayesian optimization (BO) approach for combinatorial spaces is through a reduction to BO over continuous spaces by learning a latent representation of structures.
This paper proposes a principled approach, referred to as LADDER, to overcome the drawbacks of this reduction.
arXiv Detail & Related papers (2021-11-01T18:26:22Z)
- HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning [74.76431541169342]
Zero-shot learning (ZSL) tackles the unseen class recognition problem, transferring semantic knowledge from seen classes to unseen ones.
We propose a novel hierarchical semantic-visual adaptation (HSVA) framework to align semantic and visual domains.
Experiments on four benchmark datasets demonstrate HSVA achieves superior performance on both conventional and generalized ZSL.
arXiv Detail & Related papers (2021-09-30T14:27:50Z)
- Efficient Semantic Image Synthesis via Class-Adaptive Normalization [116.63715955932174]
Class-adaptive normalization (CLADE) is a lightweight but equally effective variant that is adaptive only to the semantic class.
We introduce an intra-class positional map encoding calculated from semantic layouts to modulate the normalization parameters of CLADE.
The proposed CLADE can be generalized to different SPADE-based methods while achieving generation quality comparable to SPADE; a minimal sketch of the idea follows.
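Below is a minimal sketch of the class-adaptive idea, written as a hypothetical layer rather than the official CLADE code: the per-channel modulation parameters are looked up per semantic class via embedding tables instead of being predicted by convolutions over the full layout (the positional map encoding is omitted).

```python
import torch
import torch.nn as nn


class ClassAdaptiveNorm(nn.Module):
    """Normalization whose per-channel modulation is looked up by semantic
    class (an embedding table) rather than predicted from the full layout."""

    def __init__(self, num_classes, channels):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels, affine=False)
        # One (gamma, beta) pair per semantic class, shared across positions.
        self.gamma = nn.Embedding(num_classes, channels)
        self.beta = nn.Embedding(num_classes, channels)

    def forward(self, x, layout):
        # x: (B, C, H, W) features; layout: (B, H, W) integer class map.
        gamma = self.gamma(layout).permute(0, 3, 1, 2)  # (B, C, H, W)
        beta = self.beta(layout).permute(0, 3, 1, 2)
        return self.norm(x) * (1 + gamma) + beta


x = torch.randn(2, 32, 16, 16)
layout = torch.randint(0, 10, (2, 16, 16))
out = ClassAdaptiveNorm(num_classes=10, channels=32)(x, layout)
```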
arXiv Detail & Related papers (2020-12-08T18:59:32Z)
- Collaborative Training of GANs in Continuous and Discrete Spaces for Text Generation [21.435286755934534]
We propose a novel text GAN architecture that promotes the collaborative training of the continuous-space and discrete-space methods.
Our model substantially outperforms state-of-the-art text GANs with respect to quality, diversity, and global consistency.
arXiv Detail & Related papers (2020-10-16T07:51:16Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When paired with a strong auto-regressive decoder, however, VAEs tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.