ContraFeat: Contrasting Deep Features for Semantic Discovery
- URL: http://arxiv.org/abs/2212.07277v1
- Date: Wed, 14 Dec 2022 15:22:13 GMT
- Title: ContraFeat: Contrasting Deep Features for Semantic Discovery
- Authors: Xinqi Zhu, Chang Xu, Dacheng Tao
- Abstract summary: StyleGAN has shown strong potential for disentangled semantic control.
Existing semantic discovery methods on StyleGAN rely on manual selection of modified latent layers to obtain satisfactory manipulation results.
We propose a model that automates this process and achieves state-of-the-art semantic discovery performance.
- Score: 102.4163768995288
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: StyleGAN has shown strong potential for disentangled semantic control, thanks
to its special design of multi-layer intermediate latent variables. However,
existing semantic discovery methods on StyleGAN rely on manual selection of
modified latent layers to obtain satisfactory manipulation results, which is
tedious and demanding. In this paper, we propose a model that automates this
process and achieves state-of-the-art semantic discovery performance. The model
consists of an attention-equipped navigator module and losses contrasting
deep-feature changes. We propose two model variants, with one contrasting
samples in a binary manner, and another one contrasting samples with learned
prototype variation patterns. The proposed losses are defined with pretrained
deep features, based on our assumption that the features can implicitly reveal
the desired semantic structure, including consistency and orthogonality.
Additionally, we design two metrics to quantitatively evaluate the performance
of semantic discovery methods on the FFHQ dataset, and show that disentangled
representations can be derived via a simple training process. Experimentally,
our models can obtain state-of-the-art semantic discovery results without
relying on latent layer-wise manual selection, and these discovered semantics
can be used to manipulate real-world images.
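The abstract describes losses that contrast deep-feature changes across samples, with a binary-contrast variant. As a rough illustration only (not the paper's actual implementation), such a binary contrast can be sketched as an InfoNCE-style objective in which the feature changes induced by the same discovered direction in two samples are positives and all other pairings are negatives; the function name, shapes, and temperature below are illustrative assumptions:

```python
import numpy as np

def contrastive_feature_loss(delta_a, delta_b, tau=0.1):
    """InfoNCE-style loss over deep-feature changes (illustrative sketch).

    delta_a, delta_b: (K, D) arrays; row k holds the pretrained-feature
    change induced by editing semantic direction k in two different
    samples. Matching rows (same direction) are positives; all other
    rows are negatives. Minimizing the loss encourages consistency of
    each direction across samples and separation between directions.
    """
    a = delta_a / np.linalg.norm(delta_a, axis=1, keepdims=True)
    b = delta_b / np.linalg.norm(delta_b, axis=1, keepdims=True)
    sim = a @ b.T / tau                       # (K, K) scaled cosine similarities
    sim -= sim.max(axis=1, keepdims=True)     # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))        # positives sit on the diagonal
```

Under this sketch, feature changes that agree across samples for the same direction (and differ across directions) yield a low loss, which matches the consistency-and-orthogonality assumption stated in the abstract.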
Related papers
- Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI [59.96044730204345]
We introduce Derivative-Free Diffusion Manifold-Constrained Gradients (FreeMCG).
FreeMCG serves as an improved basis for explainability of a given neural network.
We show that our method yields state-of-the-art results while preserving the essential properties expected of XAI tools.
arXiv Detail & Related papers (2024-11-22T11:15:14Z) - D$^4$-VTON: Dynamic Semantics Disentangling for Differential Diffusion based Virtual Try-On [32.73798955587999]
D$^4$-VTON is an innovative solution for image-based virtual try-on.
We address challenges from previous studies, such as semantic inconsistencies before and after garment warping.
arXiv Detail & Related papers (2024-07-21T10:40:53Z) - Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation tasks on NYU depth V2 and KITTI, and in the semantic segmentation task on Cityscapes.
arXiv Detail & Related papers (2023-12-22T14:40:55Z) - Dual Path Modeling for Semantic Matching by Perceiving Subtle Conflicts [14.563722352134949]
Transformer-based pre-trained models have achieved great improvements in semantic matching.
Existing models still struggle to capture subtle differences.
We propose a novel Dual Path Modeling Framework to enhance the model's ability to perceive subtle differences.
arXiv Detail & Related papers (2023-02-24T09:29:55Z) - GSMFlow: Generation Shifts Mitigating Flow for Generalized Zero-Shot Learning [55.79997930181418]
Generalized Zero-Shot Learning aims to recognize images from both the seen and unseen classes by transferring semantic knowledge from seen to unseen classes.
It is a promising solution to take advantage of generative models to hallucinate realistic unseen samples based on the knowledge learned from the seen classes.
We propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
arXiv Detail & Related papers (2022-07-05T04:04:37Z) - Gradient-Based Adversarial and Out-of-Distribution Detection [15.510581400494207]
We introduce confounding labels in gradient generation to probe the effective expressivity of neural networks.
We show that our gradient-based approach allows for capturing the anomaly in inputs based on the effective expressivity of the models.
arXiv Detail & Related papers (2022-06-16T15:50:41Z) - Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
We propose a novel diverse semantic image synthesis framework.
Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-03-11T18:59:25Z) - Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
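The closed-form factorization paper above proposes discovering semantic directions by directly decomposing pretrained generator weights. As a minimal sketch of that stated idea (with the function name and weight shape assumed, not taken from the paper's code), the candidate directions can be computed as the top eigenvectors of $A^\top A$ for a first-layer weight matrix $A$:

```python
import numpy as np

def closed_form_directions(weight, k=5):
    """Sketch of closed-form latent semantic discovery.

    weight: (out_dim, latent_dim) weight of the layer that first
    projects the latent code. The eigenvectors of weight.T @ weight
    with the largest eigenvalues are returned as candidate semantic
    directions, shape (k, latent_dim). No training is required.
    """
    eigvals, eigvecs = np.linalg.eigh(weight.T @ weight)  # symmetric PSD matrix
    order = np.argsort(eigvals)[::-1]                     # largest eigenvalues first
    return eigvecs[:, order[:k]].T                        # rows are unit directions
```

Because `eigh` returns orthonormal eigenvectors, the recovered directions are mutually orthogonal by construction, which is one reason weight decomposition is attractive as a training-free baseline for semantic discovery.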
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.