Simpler is better: spectral regularization and up-sampling techniques
for variational autoencoders
- URL: http://arxiv.org/abs/2201.07544v1
- Date: Wed, 19 Jan 2022 11:49:57 GMT
- Title: Simpler is better: spectral regularization and up-sampling techniques
for variational autoencoders
- Authors: Sara Björk, Jonas Nordhaug Myhre and Thomas Haugland Johansen
- Abstract summary: Full characterization of the spectral behavior of generative models based on neural networks remains an open issue.
Recent research has focused heavily on generative adversarial networks and the high-frequency discrepancies between real and generated images.
We propose a simple 2D Fourier transform-based spectral regularization loss for Variational Autoencoders (VAEs).
- Score: 1.2234742322758418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Full characterization of the spectral behavior of generative models based on
neural networks remains an open issue. Recent research has focused heavily on
generative adversarial networks and the high-frequency discrepancies between
real and generated images. The current solution to avoid this is to either
replace transposed convolutions with bilinear up-sampling or add a spectral
regularization term in the generator. It is well known that Variational
Autoencoders (VAEs) also suffer from these issues. In this work, we propose a
simple 2D Fourier transform-based spectral regularization loss for the VAE and
show that it can achieve results equal to, or better than, the current
state-of-the-art in frequency-aware losses for generative models. In addition,
we experiment with altering the up-sampling procedure in the generator network
and investigate how it influences the spectral performance of the model. We
include experiments on synthetic and real data sets to demonstrate our results.
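To illustrate the core idea, a 2D Fourier transform-based spectral loss can be sketched as a distance between the log-magnitude spectra of a real and a generated image. This is a minimal sketch, not the paper's exact formulation: the function name, the use of log-magnitude, and the mean-squared distance are assumptions.

```python
import numpy as np

def spectral_loss(x, y):
    """Hypothetical 2D FFT-based spectral regularization loss.

    Compares the log-magnitude spectra of a real image x and a
    generated image y; the paper's exact loss may differ.
    """
    fx = np.fft.fftshift(np.fft.fft2(x))
    fy = np.fft.fftshift(np.fft.fft2(y))
    log_mag_x = np.log1p(np.abs(fx))  # log1p for numerical stability near 0
    log_mag_y = np.log1p(np.abs(fy))
    return float(np.mean((log_mag_x - log_mag_y) ** 2))
```

In a VAE, such a term would typically be added to the usual reconstruction-plus-KL objective with a tunable weight.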
Related papers
- Frequency-Aware Deepfake Detection: Improving Generalizability through
Frequency Space Learning [81.98675881423131]
This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images.
Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries.
We introduce a novel frequency-aware approach called FreqNet, centered around frequency domain learning, specifically designed to enhance the generalizability of deepfake detectors.
arXiv Detail & Related papers (2024-03-12T01:28:00Z)
- Spectrum Translation for Refinement of Image Generation (STIG) Based on
Contrastive Learning and Spectral Filter Profile [15.5188527312094]
We propose a framework to mitigate the disparity in the frequency domain of generated images.
This is realized by spectrum translation for the refinement of image generation (STIG) based on contrastive learning.
We evaluate our framework across eight fake image datasets and various cutting-edge models to demonstrate the effectiveness of STIG.
arXiv Detail & Related papers (2024-03-08T06:39:24Z)
- Graph Generation via Spectral Diffusion [51.60814773299899]
We present GRASP, a novel graph generative model based on 1) the spectral decomposition of the graph Laplacian matrix and 2) a diffusion process.
Specifically, we propose to use a denoising model to sample eigenvectors and eigenvalues from which we can reconstruct the graph Laplacian and adjacency matrix.
Our permutation invariant model can also handle node features by concatenating them to the eigenvectors of each node.
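The reconstruction step described above can be sketched with a small helper: given sampled eigenvalues and eigenvectors, rebuild the Laplacian and recover the adjacency matrix. This assumes an unweighted combinatorial Laplacian L = D - A; the sampling model itself (GRASP's denoising diffusion) is omitted, and the function name is hypothetical.

```python
import numpy as np

def graph_from_spectrum(eigvals, eigvecs):
    # Rebuild L = V diag(lambda) V^T from the sampled spectrum, then
    # recover A from L = D - A, using that diag(L) holds the degrees.
    L = eigvecs @ np.diag(eigvals) @ eigvecs.T
    A = np.diag(np.diag(L)) - L
    return L, A
```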
arXiv Detail & Related papers (2024-02-29T09:26:46Z)
- Deep Spectral Meshes: Multi-Frequency Facial Mesh Processing with Graph
Neural Networks [1.170907599257096]
Spectral meshes are introduced as a method to decompose mesh deformations into low- and high-frequency deformations.
A parametric model for 3D facial mesh synthesis is built upon the proposed framework.
Our model takes further advantage of spectral partitioning by representing different frequency levels with disparate, more suitable representations.
arXiv Detail & Related papers (2024-02-15T23:17:08Z)
- Short-Time Fourier Transform for deblurring Variational Autoencoders [0.0]
Variational Autoencoders (VAEs) are powerful generative models.
Their generated samples are known to suffer from a characteristic blurriness, as compared to the outputs of alternative generating techniques.
arXiv Detail & Related papers (2024-01-06T08:57:11Z)
- Degradation-Noise-Aware Deep Unfolding Transformer for Hyperspectral
Image Denoising [9.119226249676501]
Hyperspectral images (HSIs) are often quite noisy because of narrow band spectral filtering.
To reduce the noise in HSI data cubes, both model-driven and learning-based denoising algorithms have been proposed.
This paper proposes a Degradation-Noise-Aware Unfolding Network (DNA-Net) that addresses these issues.
arXiv Detail & Related papers (2023-05-06T13:28:20Z)
- LD-GAN: Low-Dimensional Generative Adversarial Network for Spectral
Image Generation with Variance Regularization [72.4394510913927]
Deep learning methods are state-of-the-art for spectral image (SI) computational tasks.
GANs enable diverse augmentation by learning and sampling from the data distribution.
GAN-based SI generation is challenging since the high-dimensional nature of this kind of data hinders the convergence of GAN training, yielding suboptimal generation.
We propose a statistical regularization to control the low-dimensional representation variance for the autoencoder training and to achieve high diversity of samples generated with the GAN.
arXiv Detail & Related papers (2023-04-29T00:25:02Z)
- SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with
Adaptive Noise Spectral Shaping [51.698273019061645]
SpecGrad adapts the diffusion noise so that its time-varying spectral envelope becomes close to the conditioning log-mel spectrogram.
It is processed in the time-frequency domain to keep the computational cost almost the same as the conventional DDPM-based neural vocoders.
arXiv Detail & Related papers (2022-03-31T02:08:27Z)
- Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction [138.04956118993934]
We propose a novel Transformer-based method, the coarse-to-fine sparse Transformer (CST), which embeds HSI sparsity into deep learning for HSI reconstruction.
In particular, CST uses our proposed spectra-aware screening mechanism (SASM) for coarse patch selection. The selected patches are then fed into our customized spectra-aggregation hashing multi-head self-attention (SAH-MSA) for fine pixel clustering and self-similarity capturing.
arXiv Detail & Related papers (2022-03-09T16:17:47Z)
- Conditioning Trick for Training Stable GANs [70.15099665710336]
We propose a conditioning trick, called difference departure from normality, applied on the generator network in response to instability issues during GAN training.
We force the generator to get closer to the departure from normality function of real samples computed in the spectral domain of Schur decomposition.
arXiv Detail & Related papers (2020-10-12T16:50:22Z)
- Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are
Failing to Reproduce Spectral Distributions [13.439086686599891]
We show that up-convolution, or transposed convolution, causes the inability of such models to reproduce the spectral distributions of natural training data correctly.
We propose to add a novel spectral regularization term to the training optimization objective.
We show that this approach allows training spectrally consistent GANs that avoid high-frequency errors.
arXiv Detail & Related papers (2020-03-03T23:04:33Z)
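The spectral discrepancies discussed in these papers are commonly visualized with an azimuthally averaged (radial) power spectrum of an image. The sketch below is illustrative: the binning scheme and normalization are my assumptions, not the procedure from any particular paper.

```python
import numpy as np

def radial_power_spectrum(img, n_bins=None):
    """Azimuthally averaged 2D power spectrum (illustrative sketch).

    Up-sampling artifacts typically show up as excess energy in the
    high-frequency (large-radius) bins of this curve.
    """
    f = np.fft.fftshift(np.fft.fft2(img))
    power = np.abs(f) ** 2
    h, w = img.shape
    cy, cx = h // 2, w // 2          # DC sits here after fftshift
    y, x = np.indices((h, w))
    r = np.sqrt((y - cy) ** 2 + (x - cx) ** 2).astype(int)
    n_bins = n_bins or int(r.max()) + 1
    sums = np.bincount(r.ravel(), weights=power.ravel(), minlength=n_bins)
    counts = np.bincount(r.ravel(), minlength=n_bins)
    return sums[:n_bins] / np.maximum(counts[:n_bins], 1)
```

Comparing this curve for real and generated images makes the high-frequency gap visible as a divergence in the tail.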
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.