ContraFeat: Contrasting Deep Features for Semantic Discovery
- URL: http://arxiv.org/abs/2212.07277v1
- Date: Wed, 14 Dec 2022 15:22:13 GMT
- Title: ContraFeat: Contrasting Deep Features for Semantic Discovery
- Authors: Xinqi Zhu, Chang Xu, Dacheng Tao
- Abstract summary: StyleGAN has shown strong potential for disentangled semantic control.
Existing semantic discovery methods on StyleGAN rely on manual selection of modified latent layers to obtain satisfactory manipulation results.
We propose a model that automates this process and achieves state-of-the-art semantic discovery performance.
- Score: 102.4163768995288
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: StyleGAN has shown strong potential for disentangled semantic control, thanks
to its special design of multi-layer intermediate latent variables. However,
existing semantic discovery methods on StyleGAN rely on manual selection of
modified latent layers to obtain satisfactory manipulation results, which is
tedious and demanding. In this paper, we propose a model that automates this
process and achieves state-of-the-art semantic discovery performance. The model
consists of an attention-equipped navigator module and losses contrasting
deep-feature changes. We propose two model variants, with one contrasting
samples in a binary manner, and another one contrasting samples with learned
prototype variation patterns. The proposed losses are defined with pretrained
deep features, based on our assumption that the features can implicitly reveal
the desired semantic structure including consistency and orthogonality.
Additionally, we design two metrics to quantitatively evaluate the performance
of semantic discovery methods on the FFHQ dataset, and also show that disentangled
representations can be derived via a simple training process. Experimentally,
our models can obtain state-of-the-art semantic discovery results without
relying on latent layer-wise manual selection, and these discovered semantics
can be used to manipulate real-world images.
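The binary contrasting idea in the abstract can be illustrated with a small sketch. This is not the paper's loss: the function names `cosine` and `binary_contrast_loss`, the squared-cosine similarity, and the cross-entropy form are all assumptions chosen for illustration. The sketch takes deep-feature changes (feature of the edited image minus feature of the original) that have already been computed, and rewards pairs from the same latent direction for being aligned (consistency) and pairs from different directions for being orthogonal.

```python
import math


def cosine(u, v, eps=1e-8):
    """Cosine similarity between two feature-change vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, eps)


def binary_contrast_loss(feat_changes, dir_ids, eps=1e-8):
    """Hypothetical binary contrastive loss over deep-feature changes.

    feat_changes: list of vectors, each the change in a pretrained
                  feature caused by moving along one latent direction.
    dir_ids:      list of ints, the direction index that produced each change.
    Same-direction pairs are pushed toward squared cosine 1,
    different-direction pairs toward squared cosine 0.
    """
    n = len(feat_changes)
    if n < 2:
        raise ValueError("need at least two feature changes to contrast")
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            # Squared cosine lies in [0, 1] and ignores the sign of the change.
            p = cosine(feat_changes[i], feat_changes[j]) ** 2
            p = min(max(p, eps), 1.0 - eps)
            if dir_ids[i] == dir_ids[j]:
                total += -math.log(p)        # consistency term
            else:
                total += -math.log(1.0 - p)  # orthogonality term
            count += 1
    return total / count


# Toy data: two directions whose feature changes are consistent and orthogonal.
structured = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
# Toy data: every direction changes the features in the same way.
entangled = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
ids = [0, 0, 1, 1]

loss_structured = binary_contrast_loss(structured, ids)
loss_entangled = binary_contrast_loss(entangled, ids)
```

In this toy setup the structured case yields a much lower loss than the entangled one. In a real pipeline the feature changes would come from a pretrained network applied to generator outputs, and the loss would be written in an autodiff framework so the navigator can be trained; both are outside the scope of this sketch.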
Related papers
- D$^4$-VTON: Dynamic Semantics Disentangling for Differential Diffusion based Virtual Try-On [32.73798955587999]
D$^4$-VTON is an innovative solution for image-based virtual try-on.
We address challenges from previous studies, such as semantic inconsistencies before and after garment warping.
arXiv Detail & Related papers (2024-07-21T10:40:53Z)
- Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation on NYU depth V2 and KITTI, and in semantic segmentation on CityScapes.
arXiv Detail & Related papers (2023-12-22T14:40:55Z)
- Semantic Prompt for Few-Shot Image Recognition [76.68959583129335]
We propose a novel Semantic Prompt (SP) approach for few-shot learning.
The proposed approach achieves promising results, improving the 1-shot learning accuracy by 3.67% on average.
arXiv Detail & Related papers (2023-03-24T16:32:19Z)
- Dual Path Modeling for Semantic Matching by Perceiving Subtle Conflicts [14.563722352134949]
Transformer-based pre-trained models have achieved great improvements in semantic matching.
Existing models still suffer from insufficient ability to capture subtle differences.
We propose a novel Dual Path Modeling Framework to enhance the model's ability to perceive subtle differences.
arXiv Detail & Related papers (2023-02-24T09:29:55Z)
- GSMFlow: Generation Shifts Mitigating Flow for Generalized Zero-Shot Learning [55.79997930181418]
Generalized Zero-Shot Learning aims to recognize images from both the seen and unseen classes by transferring semantic knowledge from seen to unseen classes.
A promising solution is to use generative models to hallucinate realistic unseen samples based on the knowledge learned from the seen classes.
We propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
arXiv Detail & Related papers (2022-07-05T04:04:37Z)
- Gradient-Based Adversarial and Out-of-Distribution Detection [15.510581400494207]
We introduce confounding labels in gradient generation to probe the effective expressivity of neural networks.
We show that our gradient-based approach allows for capturing the anomaly in inputs based on the effective expressivity of the models.
arXiv Detail & Related papers (2022-06-16T15:50:41Z)
- Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
We propose a novel diverse semantic image synthesis framework.
Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-03-11T18:59:25Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.