Direct Evolutionary Optimization of Variational Autoencoders With Binary
Latents
- URL: http://arxiv.org/abs/2011.13704v2
- Date: Fri, 24 Mar 2023 13:14:14 GMT
- Title: Direct Evolutionary Optimization of Variational Autoencoders With Binary
Latents
- Authors: Enrico Guiraud, Jakob Drefs, Jörg Lücke
- Abstract summary: We show that it is possible to train Variational Autoencoders (VAEs) with discrete latents without sampling-based approximation and reparameterization.
In contrast to large supervised networks, the VAEs investigated here can, for example, denoise a single image without prior training on clean data or on large image datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Discrete latent variables are considered important for real world data, which
has motivated research on Variational Autoencoders (VAEs) with discrete
latents. However, standard VAE training is not possible in this case, which has
motivated different strategies to manipulate discrete distributions in order to
train discrete VAEs similarly to conventional ones. Here we ask if it is also
possible to keep the discrete nature of the latents fully intact by applying a
direct discrete optimization for the encoding model. The approach consequently
diverges strongly from standard VAE training by sidestepping sampling
approximations, the reparameterization trick, and amortization. Discrete
optimization is realized in a variational setting using truncated posteriors in
conjunction with evolutionary algorithms. For VAEs with binary latents, we show
(A) how such a discrete variational method ties into gradient ascent for the
network weights, and (B) how the decoder is used to select latent states for
training. Conventional amortized training is more efficient and applicable to
large neural networks. However, for smaller networks, we find direct discrete
optimization to scale efficiently to hundreds of latents. More importantly, we
find direct optimization to be highly competitive in `zero-shot' learning. In
contrast to large supervised networks, the VAEs investigated here can, for
example, denoise a single image without prior training on clean data or on
large image datasets. More generally, the studied approach shows that VAEs can
indeed be trained without sampling-based approximation and reparameterization,
which may be of interest for the analysis of VAE training in general.
Furthermore, in `zero-shot' settings, direct optimization makes VAEs
competitive where they have previously been outperformed by non-generative
approaches.
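As a rough illustration of the training scheme described above (truncated posteriors over binary latent states, evolved per datapoint and scored by the decoder, combined with gradient ascent on the decoder weights), the following minimal PyTorch sketch may be useful. It assumes a Bernoulli prior and a Gaussian observation model, simplifies the evolutionary operators to plain bit-flip mutation plus selection, and all names and hyperparameters (evolve_states, m_step, H, S, p_flip, ...) are illustrative assumptions rather than the paper's implementation.

import torch

H, D, S = 64, 784, 32             # latent bits, observed dims, states kept per datapoint
prior_pi = torch.full((H,), 0.1)  # Bernoulli prior means (assumed fixed here)

decoder = torch.nn.Sequential(    # small decoder network for p(x | s)
    torch.nn.Linear(H, 256), torch.nn.ReLU(), torch.nn.Linear(256, D)
)
log_sigma2 = torch.zeros((), requires_grad=True)  # observation noise (learned)
opt = torch.optim.Adam(list(decoder.parameters()) + [log_sigma2], lr=1e-3)

def log_joint(x, states):
    # log p(x, s) up to constants, for binary states of shape (S, H) and one x of shape (D,)
    log_prior = (states * prior_pi.log()
                 + (1.0 - states) * (1.0 - prior_pi).log()).sum(-1)
    mu = decoder(states)
    log_lik = -0.5 * (((x - mu) ** 2) / log_sigma2.exp() + log_sigma2).sum(-1)
    return log_prior + log_lik

def evolve_states(x, states, n_children=32, p_flip=2.0 / H):
    # Evolutionary "E-step": propose bit-flip mutations of the kept states and
    # retain the S states with the highest log p(x, s) (decoder-based selection).
    with torch.no_grad():
        parents = states[torch.randint(len(states), (n_children,))]
        flips = (torch.rand(n_children, H) < p_flip).float()
        children = (parents + flips) % 2.0
        pool = torch.cat([states, children]).unique(dim=0)
        top = log_joint(x, pool).topk(min(S, len(pool))).indices
    return pool[top]

def m_step(x, states):
    # Gradient ascent on the truncated free energy: an expectation over the kept
    # states only, weighted by their renormalized posterior mass.
    with torch.no_grad():
        w = torch.softmax(log_joint(x, states), dim=0)
    loss = -(w * log_joint(x, states)).sum()
    opt.zero_grad(); loss.backward(); opt.step()

# Zero-shot usage on a single image x of shape (D,): initialise one state set
# and alternate the evolutionary search with the gradient step, e.g.
#   states = (torch.rand(S, H) < prior_pi).float()
#   for _ in range(1000):
#       states = evolve_states(x, states)
#       m_step(x, states)

Note how the departure from standard VAE training shows up in this sketch: there is no encoder network, no sampling-based gradient estimator and no reparameterization; the set of kept binary states itself acts as the (truncated) variational posterior.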
Related papers
- Variational Bayes image restoration with compressive autoencoders [4.879530644978008]
Regularization of inverse problems is of paramount importance in computational imaging.
In this work, we first propose to use compressive autoencoders instead of state-of-the-art generative models.
As a second contribution, we introduce the Variational Bayes Latent Estimation (VBLE) algorithm.
arXiv Detail & Related papers (2023-11-29T15:49:31Z)
- Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability [0.0]
We study gradient descent (GD)-based sparse training and evolution strategies (ES).
We find that ES explore diverse and flat local optima and do not preserve linear mode connectivity across sparsity levels and independent runs.
arXiv Detail & Related papers (2023-05-31T15:58:54Z)
- GFlowNet-EM for learning compositional latent variable models [115.96660869630227]
A key tradeoff in modeling the posteriors over latents is between expressivity and tractable optimization.
We propose the use of GFlowNets, algorithms for sampling from an unnormalized density.
By training GFlowNets to sample from the posterior over latents, we take advantage of their strengths as amortized variational algorithms.
arXiv Detail & Related papers (2023-02-13T18:24:21Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
- A Differential Game Theoretic Neural Optimizer for Training Residual Networks [29.82841891919951]
We propose a generalized Differential Dynamic Programming (DDP) neural architecture that accepts both residual connections and convolution layers.
The resulting optimal control representation admits a game-theoretic perspective, in which training residual networks can be interpreted as cooperative trajectory optimization on state-augmented systems.
arXiv Detail & Related papers (2020-07-17T10:19:17Z)
- Stochastic Batch Augmentation with An Effective Distilled Dynamic Soft Label Regularizer [11.153892464618545]
We propose a framework called Stochastic Batch Augmentation (SBA) to address these problems.
SBA decides whether to augment at iterations controlled by the batch scheduler, and introduces a 'distilled' dynamic soft label regularization.
Our experiments on CIFAR-10, CIFAR-100, and ImageNet show that SBA can improve the generalization of the neural networks and speed up the convergence of network training.
arXiv Detail & Related papers (2020-06-27T04:46:39Z)
- Dynamic Scale Training for Object Detection [111.33112051962514]
We propose a Dynamic Scale Training paradigm (abbreviated as DST) to mitigate scale variation challenge in object detection.
Experimental results demonstrate the efficacy of our proposed DST towards scale variation handling.
It does not introduce inference overhead and could serve as a free lunch for general detection configurations.
arXiv Detail & Related papers (2020-04-26T16:48:17Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)