Toward a Controllable Disentanglement Network
- URL: http://arxiv.org/abs/2001.08572v3
- Date: Sat, 20 Jun 2020 04:02:22 GMT
- Title: Toward a Controllable Disentanglement Network
- Authors: Zengjie Song, Oluwasanmi Koyejo, Jiangshe Zhang
- Abstract summary: This paper addresses two crucial problems of learning disentangled image representations, namely controlling the degree of disentanglement during image editing, and balancing the disentanglement strength and the reconstruction quality.
By exploring the real-valued space of the soft target representation, we are able to synthesize novel images with the designated properties.
- Score: 22.968760397814993
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses two crucial problems of learning disentangled image
representations, namely controlling the degree of disentanglement during image
editing, and balancing the disentanglement strength and the reconstruction
quality. To encourage disentanglement, we devise a distance covariance based
decorrelation regularization. Further, for the reconstruction step, our model
leverages a soft target representation combined with the latent image code. By
exploring the real-valued space of the soft target representation, we are able
to synthesize novel images with the designated properties. To improve the
perceptual quality of images generated by autoencoder (AE)-based models, we
extend the encoder-decoder architecture with the generative adversarial network
(GAN) by collapsing the AE decoder and the GAN generator into one. We also
design a classification based protocol to quantitatively evaluate the
disentanglement strength of our model. Experimental results showcase the
benefits of the proposed model.
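The decorrelation regularization above penalizes statistical dependence between latent factors via distance covariance. As a minimal illustration (not the paper's implementation; the function name, batch shapes, and NumPy formulation are assumptions), the squared sample distance covariance that such a penalty would drive toward zero can be computed as:

```python
import numpy as np

def dcov2(x, y):
    """Squared sample distance covariance between two batches of codes.

    x: (n, p) array, y: (n, q) array -- n paired samples of two
    representation blocks. Returns 0 iff the double-centered pairwise
    distance matrices are uncorrelated (independence => dcov2 -> 0).
    """
    x = np.atleast_2d(x).astype(float)
    y = np.atleast_2d(y).astype(float)
    # Pairwise Euclidean distance matrices, shape (n, n).
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)
    # Double-center: subtract row/column means, add back the grand mean.
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    # V-statistic estimate of squared distance covariance.
    return (A * B).mean()
```

Used as a training loss, this quantity would be added (with a weight) to the reconstruction objective to discourage correlation between attribute codes and the residual image code.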
Related papers
- A Compact and Semantic Latent Space for Disentangled and Controllable
Image Editing [4.8201607588546]
We propose an auto-encoder which re-organizes the latent space of StyleGAN, so that each attribute which we wish to edit corresponds to an axis of the new latent space.
We show that our approach has greater disentanglement than competing methods, while maintaining fidelity to the original image with respect to identity.
arXiv Detail & Related papers (2023-12-13T16:18:45Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimization to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z) - CM-GAN: Image Inpainting with Cascaded Modulation GAN and Object-Aware
Training [112.96224800952724]
We propose cascaded modulation GAN (CM-GAN) to generate plausible image structures when dealing with large holes in complex images.
In each decoder block, global modulation is first applied to perform coarse, semantic-aware structure synthesis; spatial modulation is then applied to the output of the global modulation to further adjust the feature map in a spatially adaptive fashion.
In addition, we design an object-aware training scheme to prevent the network from hallucinating new objects inside holes, fulfilling the needs of object removal tasks in real-world scenarios.
arXiv Detail & Related papers (2022-03-22T16:13:27Z) - Toward Interactive Modulation for Photo-Realistic Image Restoration [16.610981587637102]
Modulating image restoration level aims to generate a restored image by altering a factor that represents the restoration strength.
This paper presents a Controllable Unet Generative Adversarial Network (CUGAN) to generate high-frequency textures in the modulation tasks.
arXiv Detail & Related papers (2021-05-07T07:05:56Z) - Towards Unsupervised Deep Image Enhancement with Generative Adversarial
Network [92.01145655155374]
We present an unsupervised image enhancement generative network (UEGAN).
It learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner.
Results show that the proposed model effectively improves the aesthetic quality of images.
arXiv Detail & Related papers (2020-12-30T03:22:46Z) - Improving Augmentation and Evaluation Schemes for Semantic Image
Synthesis [16.097324852253912]
We introduce a novel augmentation scheme designed specifically for generative adversarial networks (GANs).
We propose to randomly warp object shapes in the semantic label maps used as an input to the generator.
The local shape discrepancies between the warped and non-warped label maps and images enable the GAN to learn better the structural and geometric details of the scene.
arXiv Detail & Related papers (2020-11-25T10:55:26Z) - LT-GAN: Self-Supervised GAN with Latent Transformation Detection [10.405721171353195]
We propose a self-supervised approach (LT-GAN) to improve the generation quality and diversity of images.
We experimentally demonstrate that our proposed LT-GAN can be effectively combined with other state-of-the-art training techniques for added benefits.
arXiv Detail & Related papers (2020-10-19T22:09:45Z) - Multi-Scale Boosted Dehazing Network with Dense Feature Fusion [92.92572594942071]
We propose a Multi-Scale Boosted Dehazing Network with Dense Feature Fusion based on the U-Net architecture.
We show that the proposed model performs favorably against the state-of-the-art approaches on the benchmark datasets as well as real-world hazy images.
arXiv Detail & Related papers (2020-04-28T09:34:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.