Contrastive Learning for Diverse Disentangled Foreground Generation
- URL: http://arxiv.org/abs/2211.02707v1
- Date: Fri, 4 Nov 2022 18:51:04 GMT
- Title: Contrastive Learning for Diverse Disentangled Foreground Generation
- Authors: Yuheng Li, Yijun Li, Jingwan Lu, Eli Shechtman, Yong Jae Lee, Krishna
Kumar Singh
- Abstract summary: We introduce a new method for diverse foreground generation with explicit control over various factors.
We leverage contrastive learning with latent codes to generate diverse foreground results for the same masked input.
Experiments demonstrate the superiority of our method over state-of-the-art methods in result diversity and generation controllability.
- Score: 67.81298739373766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a new method for diverse foreground generation with explicit
control over various factors. Existing image inpainting based foreground
generation methods often struggle to generate diverse results and rarely allow
users to explicitly control specific factors of variation (e.g., varying the
facial identity or expression for face inpainting results). We leverage
contrastive learning with latent codes to generate diverse foreground results
for the same masked input. Specifically, we define two sets of latent codes,
where one controls a pre-defined factor (``known''), and the other controls the
remaining factors (``unknown''). The sampled latent codes from the two sets
jointly bi-modulate the convolution kernels to guide the generator to
synthesize diverse results. Experiments demonstrate the superiority of our
method over state-of-the-art methods in result diversity and generation
controllability.
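The abstract's core mechanism, two latent code sets ("known" and "unknown") jointly bi-modulating convolution kernels, paired with a contrastive objective for diversity, can be illustrated with a minimal NumPy sketch. This is a hypothetical illustration, not the paper's implementation: the per-channel modulation/demodulation follows the StyleGAN2-style convention, and `info_nce` is a generic InfoNCE-style contrastive loss; all function and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def bi_modulate(weight, z_known, z_unknown):
    """Bi-modulate conv kernels with two per-channel latent styles (sketch).

    weight: (out_ch, in_ch, k, k) base convolution kernels.
    z_known / z_unknown: per-input-channel scales derived from the two
    latent code sets (the pre-defined factor vs. the remaining factors).
    """
    s = (z_known * z_unknown)[None, :, None, None]  # joint per-channel scale
    w = weight * s
    # Demodulation: normalize each output filter to unit norm for stability,
    # as in StyleGAN2-style weight modulation.
    norm = np.sqrt((w ** 2).sum(axis=(1, 2, 3), keepdims=True) + 1e-8)
    return w / norm

def info_nce(anchor, positive, negatives, tau=0.1):
    """Generic InfoNCE-style contrastive loss on feature vectors."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / tau
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                     # pull the positive pair together

# Bi-modulate an 8x4 bank of 3x3 kernels with sampled "known"/"unknown" codes.
out_ch, in_ch, k = 8, 4, 3
weight = rng.normal(size=(out_ch, in_ch, k, k))
w_mod = bi_modulate(weight,
                    z_known=rng.normal(size=in_ch),
                    z_unknown=rng.normal(size=in_ch))
print(w_mod.shape)

# Features of two results sharing a latent code would act as a positive pair;
# results generated from different codes act as negatives.
feats = [rng.normal(size=16) for _ in range(4)]
loss = info_nce(feats[0], feats[1], feats[2:])
print(loss > 0)
```

Sampling different `z_known` values while holding `z_unknown` fixed (or vice versa) would then vary only one group of factors, which is the kind of disentangled control the abstract describes.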
Related papers
- COMICS: End-to-end Bi-grained Contrastive Learning for Multi-face Forgery Detection [56.7599217711363]
Most face forgery recognition methods can only process one face at a time.
We propose COMICS, an end-to-end framework for multi-face forgery detection.
arXiv Detail & Related papers (2023-08-03T03:37:13Z)
- Factor Decomposed Generative Adversarial Networks for Text-to-Image Synthesis [7.658760090153791]
We propose Factor Decomposed Generative Adversarial Networks (FDGAN).
We first generate images from the noise vector and then apply the sentence embedding in the normalization layer for both the generator and the discriminator.
The experimental results show that decomposing the noise and the sentence embedding can disentangle latent factors in text-to-image synthesis.
arXiv Detail & Related papers (2023-03-24T05:57:53Z)
- Identifiability Results for Multimodal Contrastive Learning [72.15237484019174]
We show that it is possible to recover shared factors in a more general setup than the multi-view setting studied previously.
Our work provides a theoretical basis for multimodal representation learning and explains in which settings multimodal contrastive learning can be effective in practice.
arXiv Detail & Related papers (2023-03-16T09:14:26Z)
- DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network [70.12848483302915]
Conditional generative adversarial networks (cGANs) aim to synthesize diverse images given the input conditions and latent codes.
The recent MSGAN tried to encourage diversity in the generated images but only considers "negative" relations between image pairs.
We propose a novel DivCo framework to properly constrain both "positive" and "negative" relations between the generated images specified in the latent space.
arXiv Detail & Related papers (2021-03-14T11:11:15Z)
- RoutingGAN: Routing Age Progression and Regression with Disentangled Learning [20.579282497730944]
This paper introduces a dropout-like method based on GANs (RoutingGAN) to route different effects in a high-level semantic feature space.
We first disentangle the age-invariant features from the input face, and then gradually add the effects to the features by residual routers.
Experimental results on two benchmarked datasets demonstrate superior performance over existing methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2021-02-01T02:57:32Z)
- Composed Variational Natural Language Generation for Few-shot Intents [118.37774762596123]
We generate training examples for few-shot intents in the realistic imbalanced scenario.
To evaluate the quality of the generated utterances, experiments are conducted on the generalized few-shot intent detection task.
Our proposed model achieves state-of-the-art performances on two real-world intent detection datasets.
arXiv Detail & Related papers (2020-09-21T17:48:43Z)
- Unsupervised Controllable Generation with Self-Training [90.04287577605723]
Controllable generation with GANs remains a challenging research problem.
We propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.
Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder.
arXiv Detail & Related papers (2020-07-17T21:50:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.