ControlVAE: Controllable Variational Autoencoder
- URL: http://arxiv.org/abs/2004.05988v5
- Date: Sat, 20 Jun 2020 20:21:48 GMT
- Title: ControlVAE: Controllable Variational Autoencoder
- Authors: Huajie Shao, Shuochao Yao, Dachun Sun, Aston Zhang, Shengzhong Liu,
Dongxin Liu, Jun Wang, Tarek Abdelzaher
- Abstract summary: Variational Autoencoders (VAE) have been widely used in a variety of applications, such as dialog generation, image generation and disentangled representation learning.
ControlVAE combines a controller, inspired by automatic control theory, with the basic VAE to improve the performance of resulting generative models.
- Score: 16.83870832766681
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variational Autoencoders (VAE) and their variants have been widely used in a
variety of applications, such as dialog generation, image generation and
disentangled representation learning. However, existing VAE models have
limitations in different applications: for example, a VAE easily suffers from
KL vanishing in language modeling and from low reconstruction quality in
disentangled representation learning. To address these issues, we propose a novel controllable
variational autoencoder framework, ControlVAE, that combines a controller,
inspired by automatic control theory, with the basic VAE to improve the
performance of resulting generative models. Specifically, we design a new
non-linear PI controller, a variant of proportional-integral-derivative
(PID) control, to automatically tune the hyperparameter (weight) added to the
VAE objective, using the output KL divergence as feedback during model training.
The framework is evaluated on three applications: language modeling,
disentangled representation learning, and image generation. The results show
that ControlVAE can achieve better disentangling and reconstruction quality
than the existing methods. For language modeling, it not only averts
KL vanishing but also improves the diversity of generated text. Finally, we
also demonstrate that ControlVAE improves the reconstruction quality of
generated images compared to the original VAE.
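As a concrete illustration of the feedback loop described above, here is a minimal sketch of a nonlinear PI controller that adjusts the KL weight beta from the observed KL divergence each training step. The gains, bounds, and anti-windup rule below are illustrative assumptions for exposition, not the paper's tuned settings.

```python
import math

def pi_controller_step(kl_observed, kl_target, state,
                       kp=0.01, ki=0.0001,
                       beta_min=0.0, beta_max=1.0):
    """One step of a nonlinear PI controller in the spirit of ControlVAE.

    The error is (target KL - observed KL). A sigmoid of the error gives a
    bounded proportional term; the integral term accumulates the error,
    with simple anti-windup (no accumulation while beta is saturated).
    Gains and bounds are illustrative, not the paper's values.
    """
    error = kl_target - kl_observed
    # Nonlinear P term: large observed KL (negative error) pushes this toward kp.
    p_term = kp / (1.0 + math.exp(error))
    # Integral term: subtracting ki * error raises beta while KL is too high.
    if beta_min <= state["beta"] <= beta_max:
        state["integral"] -= ki * error
    beta = min(max(p_term + state["integral"], beta_min), beta_max)
    state["beta"] = beta
    return beta

# Usage: the controller state persists across training steps.
state = {"beta": 0.0, "integral": 0.0}
beta = pi_controller_step(kl_observed=25.0, kl_target=3.0, state=state)
```

When the observed KL overshoots the target, both terms increase beta, strengthening the KL penalty; when the KL falls below the target, beta relaxes, which is how the loop holds the KL near a setpoint instead of letting it vanish.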
Related papers
- Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference.
Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable.
We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z) - CAR: Controllable Autoregressive Modeling for Visual Generation [100.33455832783416]
Controllable AutoRegressive Modeling (CAR) is a novel, plug-and-play framework that integrates conditional control into multi-scale latent variable modeling.
CAR progressively refines and captures control representations, which are injected into each autoregressive step of the pre-trained model to guide the generation process.
Our approach demonstrates excellent controllability across various types of conditions and delivers higher image quality compared to previous methods.
arXiv Detail & Related papers (2024-10-07T00:55:42Z) - ControlVAR: Exploring Controllable Visual Autoregressive Modeling [48.66209303617063]
Conditional visual generation has witnessed remarkable progress with the advent of diffusion models (DMs).
Challenges such as expensive computational cost, high inference latency, and difficulties of integration with large language models (LLMs) have necessitated exploring alternatives to DMs.
This paper introduces ControlVAR, a novel framework that explores pixel-level controls in visual autoregressive modeling for flexible and efficient conditional generation.
arXiv Detail & Related papers (2024-06-14T06:35:33Z) - LlaMaVAE: Guiding Large Language Model Generation via Continuous Latent
Sentence Spaces [1.529963465178546]
We present LlaMaVAE, which combines expressive encoder and decoder models (sentenceT5 and LlaMA) with a VAE architecture to provide better text generation control to large language models (LLMs).
Experimental results reveal that LlaMaVAE can outperform the previous state-of-the-art VAE language model, Optimus, across various tasks.
arXiv Detail & Related papers (2023-12-20T17:25:23Z) - Composing Ensembles of Pre-trained Models via Iterative Consensus [95.10641301155232]
We propose a unified framework for composing ensembles of different pre-trained models.
We use pre-trained models as "generators" or "scorers" and compose them via closed-loop iterative consensus optimization.
We demonstrate that consensus achieved by an ensemble of scorers outperforms the feedback of a single scorer.
arXiv Detail & Related papers (2022-10-20T18:46:31Z) - Multimodal VAE Active Inference Controller [0.0]
We present a novel active inference torque controller for industrial arms.
We include multimodal state representation learning using a linearly coupled multimodal variational autoencoder.
Results showed improved tracking and control in goal-directed reaching due to the increased representation power.
arXiv Detail & Related papers (2021-03-07T18:00:27Z) - Transformer-based Conditional Variational Autoencoder for Controllable
Story Generation [39.577220559911055]
We investigate large-scale latent variable models (LVMs) for neural story generation with objectives in two threads: generation effectiveness and controllability.
We advocate to revive latent variable modeling, essentially the power of representation learning, in the era of Transformers.
Specifically, we integrate latent representation vectors with a Transformer-based pre-trained architecture to build a conditional variational autoencoder (CVAE).
arXiv Detail & Related papers (2021-01-04T08:31:11Z) - ControlVAE: Tuning, Analytical Properties, and Performance Analysis [14.272917020105147]
ControlVAE is a new variational autoencoder framework.
It stabilizes the KL-divergence of VAE models to a specified value.
It can achieve a good trade-off between reconstruction quality and KL-divergence.
arXiv Detail & Related papers (2020-10-31T12:32:39Z) - Unsupervised Controllable Generation with Self-Training [90.04287577605723]
Controllable generation with GANs remains a challenging research problem.
We propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.
Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder.
arXiv Detail & Related papers (2020-07-17T21:50:35Z) - Simple and Effective VAE Training with Calibrated Decoders [123.08908889310258]
Variational autoencoders (VAEs) provide an effective and simple method for modeling complex distributions.
We study the impact of calibrated decoders, which learn the uncertainty of the decoding distribution.
We propose a simple but novel modification to the commonly used Gaussian decoder, which computes the prediction variance analytically.
arXiv Detail & Related papers (2020-06-23T17:57:47Z)
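The analytic-variance idea in the calibrated-decoders entry above can be sketched as follows. This is a hedged illustration of the general sigma-VAE-style trick (a Gaussian decoder whose single shared variance is set to its closed-form optimum, the mean squared error), not that paper's exact implementation.

```python
import math

def calibrated_gaussian_nll(x, x_hat, eps=1e-8):
    """NLL of a Gaussian decoder with its shared variance set analytically.

    Minimizing 0.5*d*log(2*pi*s2) + sum_sq/(2*s2) over the variance s2
    gives s2 = sum_sq/d (the MSE), at which point the NLL simplifies to
    0.5*d*(log(2*pi*s2) + 1). Pure-Python sketch over flat lists of floats.
    """
    d = len(x)
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / d
    s2 = max(mse, eps)  # variance floor for near-perfect reconstructions
    return 0.5 * d * (math.log(2 * math.pi * s2) + 1.0)
```

Because the variance tracks the current reconstruction error, the decoder's likelihood stays calibrated during training without tuning a fixed noise scale by hand.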
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.