VCE: Variational Convertor-Encoder for One-Shot Generalization
- URL: http://arxiv.org/abs/2011.06246v1
- Date: Thu, 12 Nov 2020 07:58:14 GMT
- Title: VCE: Variational Convertor-Encoder for One-Shot Generalization
- Authors: Chengshuai Li, Shuai Han, Jianping Xing
- Abstract summary: Variational Convertor-Encoder (VCE) converts an image to various styles.
We present this novel architecture for the problem of one-shot generalization.
We also improve the performance of the variational auto-encoder (VAE) so that it filters out blurred points.
- Score: 3.86981854389977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variational Convertor-Encoder (VCE) converts an image to various styles; we
present this novel architecture for the problem of one-shot generalization and
its transfer to new tasks not seen before without additional training. We also
improve the performance of the variational auto-encoder (VAE) so that it filters
out blurred points, using a novel algorithm that we propose, namely large margin
VAE (LMVAE). Two samples with the same property are input to the encoder, and then
a convertor is required to process one of them from the noisy outputs of the
encoder; finally, the noise represents a variety of transformation rules and is
used to convert new images. The algorithm combines and improves the
conditional variational auto-encoder (CVAE) and the introspective VAE; the
proposed framework aims to transform graphics instead of generating them, and it
is used for the one-shot generative process. No sequential inference algorithm
is needed in training. In comparisons on the Omniglot dataset, the results show
that our model produces more realistic and diverse images.
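The data flow described in the abstract can be illustrated with a toy sketch. This is one reading of the abstract, not the authors' implementation: the linear maps, the dimensions, and the names `encode_pair`, `sample_latent`, and `convert` are all hypothetical, and the training losses (reconstruction, KL, and the LMVAE margin term) are omitted entirely.

```python
import numpy as np

rng = np.random.default_rng(0)
IMG_DIM, LATENT_DIM = 28 * 28, 16  # assumed sizes, not taken from the paper

# Hypothetical linear maps standing in for the real encoder and convertor networks.
W_enc = rng.normal(scale=0.01, size=(2 * IMG_DIM, 2 * LATENT_DIM))
W_conv = rng.normal(scale=0.01, size=(IMG_DIM + LATENT_DIM, IMG_DIM))


def encode_pair(x_a, x_b):
    """Map a pair of same-property images to the mean and log-variance of a latent code."""
    h = np.concatenate([x_a, x_b], axis=-1) @ W_enc
    mu, log_var = np.split(h, 2, axis=-1)
    return mu, log_var


def sample_latent(mu, log_var):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def convert(x, z):
    """Combine image x with the noisy code z; sigmoid keeps output pixels in (0, 1)."""
    logits = np.concatenate([x, z], axis=-1) @ W_conv
    return 1.0 / (1.0 + np.exp(-logits))


# Training-time flow: the convertor receives x_a plus the noisy code and would be
# trained so that its output matches x_b (no training is performed in this sketch).
x_a, x_b = rng.random((4, IMG_DIM)), rng.random((4, IMG_DIM))
mu, log_var = encode_pair(x_a, x_b)
z = sample_latent(mu, log_var)
x_b_hat = convert(x_a, z)

# One-shot use: reuse the codes as transformation rules on a new image.
x_new = rng.random((1, IMG_DIM))
variants = convert(np.repeat(x_new, 4, axis=0), z)
```

Here each row of `z` plays the role of one transformation rule, so applying four different codes to the same `x_new` yields four candidate variants of that single image.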
Related papers
- Quantum Down Sampling Filter for Variational Auto-encoder [0.504868948270058]
Variational Autoencoders (VAEs) are essential tools in generative modeling and image reconstruction.
This study aims to improve the quality of reconstructed images by enhancing their resolution and preserving finer details.
We propose a hybrid model that combines quantum computing techniques in the VAE encoder with convolutional neural networks (CNNs) in the decoder.
arXiv Detail & Related papers (2025-01-09T11:08:55Z)
- Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference [95.42299246592756]
We study the UNet encoder and empirically analyze the encoder features.
We find that encoder features change minimally, whereas the decoder features exhibit substantial variations across different time-steps.
We validate our approach on other tasks: text-to-video, personalized generation and reference-guided generation.
arXiv Detail & Related papers (2023-12-15T08:46:43Z)
- Progressive Learning with Visual Prompt Tuning for Variable-Rate Image Compression [60.689646881479064]
We propose a progressive learning paradigm for transformer-based variable-rate image compression.
Inspired by visual prompt tuning, we use LPM to extract prompts for input images and hidden features at the encoder side and decoder side, respectively.
Our model outperforms all current variable-rate image methods in terms of rate-distortion performance and approaches the state-of-the-art fixed-rate image compression methods trained from scratch.
arXiv Detail & Related papers (2023-11-23T08:29:32Z)
- String-based Molecule Generation via Multi-decoder VAE [56.465033997245776]
We investigate the problem of string-based molecular generation via variational autoencoders (VAEs).
We propose a simple, yet effective idea to improve the performance of VAE for the task.
In our experiments, the proposed VAE model performs particularly well at generating samples from an out-of-domain distribution.
arXiv Detail & Related papers (2022-08-23T03:56:30Z)
- Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and Editability [76.6724135757723]
GAN inversion aims to invert an input image into the latent space of a pre-trained GAN.
Despite the recent advances in GAN inversion, there remain challenges to mitigate the tradeoff between distortion and editability.
We propose a two-step approach that first inverts the input image into a latent code, called pivot code, and then alters the generator so that the input image can be accurately mapped into the pivot code.
arXiv Detail & Related papers (2022-07-19T16:10:16Z)
- Neural Data-Dependent Transform for Learned Image Compression [72.86505042102155]
We build a neural data-dependent transform and introduce a continuous online mode decision mechanism to jointly optimize the coding efficiency for each individual image.
The experimental results show the effectiveness of the proposed neural-syntax design and the continuous online mode decision mechanism.
arXiv Detail & Related papers (2022-03-09T14:56:48Z)
- Transformer-based Image Compression [18.976159633970177]
A Transformer-based Image Compression (TIC) approach is developed that reuses the canonical variational autoencoder (VAE) architecture with paired main and hyper encoder-decoders.
TIC rivals state-of-the-art approaches, including deep convolutional neural network (CNN) based learnt image coding (LIC) methods and the handcrafted rules-based intra profile of the recently approved Versatile Video Coding (VVC) standard.
arXiv Detail & Related papers (2021-11-12T13:13:20Z)
- Learned Multi-Resolution Variable-Rate Image Compression with Octave-based Residual Blocks [15.308823742699039]
We propose a new variable-rate image compression framework, which employs generalized octave convolutions (GoConv) and generalized octave transposed-convolutions (GoTConv).
To enable a single model to operate with different bit rates and to learn multi-rate image features, a new objective function is introduced.
Experimental results show that the proposed framework, trained with the variable-rate objective function, outperforms standard codecs such as H.265/HEVC-based BPG as well as state-of-the-art learning-based variable-rate methods.
arXiv Detail & Related papers (2020-12-31T06:26:56Z)
- Simple and Effective VAE Training with Calibrated Decoders [123.08908889310258]
Variational autoencoders (VAEs) provide an effective and simple method for modeling complex distributions.
We study the impact of calibrated decoders, which learn the uncertainty of the decoding distribution.
We propose a simple but novel modification to the commonly used Gaussian decoder, which computes the prediction variance analytically.
arXiv Detail & Related papers (2020-06-23T17:57:47Z)
- A Multiparametric Class of Low-complexity Transforms for Image and Video Coding [0.0]
We introduce a new class of low-complexity 8-point DCT approximations based on a series of works published by Bouguezel, Ahmed and Swamy.
We show that the optimal DCT approximations present compelling results in terms of coding efficiency and image quality metrics.
arXiv Detail & Related papers (2020-06-19T21:56:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.