StyleGAN of All Trades: Image Manipulation with Only Pretrained StyleGAN
- URL: http://arxiv.org/abs/2111.01619v1
- Date: Tue, 2 Nov 2021 14:31:22 GMT
- Title: StyleGAN of All Trades: Image Manipulation with Only Pretrained StyleGAN
- Authors: Min Jin Chong, Hsin-Ying Lee, David Forsyth
- Abstract summary: We show that a pretrained StyleGAN, combined with a few simple operations, performs comparably to state-of-the-art methods on various tasks.
The proposed method is simple, effective, efficient, and applicable to any existing pretrained StyleGAN model.
- Score: 17.93566359555703
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, StyleGAN has enabled various image manipulation and editing tasks
thanks to the high-quality generation and the disentangled latent space.
However, additional architectures or task-specific training paradigms are
usually required for different tasks. In this work, we take a deeper look at
the spatial properties of StyleGAN. We show that a pretrained StyleGAN, combined
with some simple operations and no additional architecture, performs comparably
to state-of-the-art methods on a range of tasks, including image blending,
panorama generation, generation from a single image, controllable and local
multimodal image-to-image translation, and attribute transfer. The
proposed method is simple, effective, efficient, and applicable to any existing
pretrained StyleGAN model.
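The abstract does not spell out the operations involved. As a hedged illustration of the kind of spatial manipulation a pretrained synthesis network permits, the sketch below blends two generations by mixing their intermediate feature maps under a mask; the ToySynthesis module is a stand-in for a real pretrained StyleGAN, not the authors' code.
```python
# Minimal sketch of spatially blending two generations by mixing intermediate
# feature maps. The tiny "synthesis" network below is a stand-in for a real
# pretrained StyleGAN; only the masking/blending operation is the point.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySynthesis(nn.Module):
    """Stand-in for StyleGAN's synthesis network: latent -> feature map -> image."""
    def __init__(self, latent_dim=64, feat_ch=32, feat_res=16, img_res=64):
        super().__init__()
        self.feat_res = feat_res
        self.to_feat = nn.Linear(latent_dim, feat_ch * feat_res * feat_res)
        self.to_img = nn.Sequential(
            nn.Upsample(scale_factor=img_res // feat_res, mode="bilinear", align_corners=False),
            nn.Conv2d(feat_ch, 3, 3, padding=1),
        )

    def features(self, w):
        b = w.shape[0]
        return self.to_feat(w).view(b, -1, self.feat_res, self.feat_res)

    def from_features(self, f):
        return self.to_img(f)

def blend(synthesis, w_a, w_b, mask):
    """Copy the masked region of image B's intermediate features into image A's."""
    f_a = synthesis.features(w_a)
    f_b = synthesis.features(w_b)
    # Resize the binary mask to the feature resolution and mix.
    m = F.interpolate(mask, size=f_a.shape[-2:], mode="nearest")
    f_mix = f_a * (1 - m) + f_b * m
    return synthesis.from_features(f_mix)

if __name__ == "__main__":
    torch.manual_seed(0)
    G = ToySynthesis()
    w_a, w_b = torch.randn(1, 64), torch.randn(1, 64)
    mask = torch.zeros(1, 1, 64, 64)
    mask[:, :, :, 32:] = 1.0          # blend the right half of B into A
    out = blend(G, w_a, w_b, mask)
    print(out.shape)                   # torch.Size([1, 3, 64, 64])
```
The same mask-and-mix pattern generalizes to the other listed tasks by choosing which resolution's feature maps are mixed and how the mask is constructed.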
Related papers
- Ada-adapter:Fast Few-shot Style Personlization of Diffusion Model with Pre-trained Image Encoder [57.574544285878794]
Ada-Adapter is a novel framework for few-shot style personalization of diffusion models.
Our method enables efficient zero-shot style transfer utilizing a single reference image.
We demonstrate the effectiveness of our approach on various artistic styles, including flat art, 3D rendering, and logo design.
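The summary does not describe the architecture; the toy sketch below only illustrates the general adapter pattern named in the title, under the assumption that a frozen pretrained image encoder summarizes a single style reference and a small trainable projection injects that summary into an otherwise frozen backbone. All module names and sizes here are hypothetical.
```python
# Hedged sketch of the adapter idea: a frozen image encoder summarizes one style
# reference, and only a small projection ("adapter") is trained to inject it.
import torch
import torch.nn as nn

image_encoder = nn.Sequential(nn.Conv2d(3, 32, 4, stride=4), nn.ReLU(),
                              nn.AdaptiveAvgPool2d(1), nn.Flatten())  # frozen
backbone_block = nn.Linear(64, 64)                                    # frozen
adapter = nn.Linear(32, 64)                  # the only trainable part

for m in (image_encoder, backbone_block):
    for p in m.parameters():
        p.requires_grad_(False)

torch.manual_seed(0)
style_reference = torch.rand(1, 3, 64, 64)   # the single reference image
hidden = torch.randn(1, 64)                  # some intermediate backbone state

style_feat = image_encoder(style_reference)                 # (1, 32)
conditioned = backbone_block(hidden) + adapter(style_feat)  # style injected additively
print(conditioned.shape)  # torch.Size([1, 64])
```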
arXiv Detail & Related papers (2024-07-08T02:00:17Z) - Multimodality-guided Image Style Transfer using Cross-modal GAN Inversion [42.345533741985626]
We present a novel method to achieve much improved style transfer based on text guidance.
Our method allows style inputs from multiple sources and modalities, enabling MultiModality-guided Image Style Transfer (MMIST).
Specifically, we realize MMIST with a novel cross-modal GAN inversion method, which generates style representations consistent with specified styles.
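The exact inversion procedure is not given in the summary; the sketch below shows only the generic "inversion by optimization" pattern, assuming some cross-modal encoder scores agreement between a style code and a target description. The linear `embed` projection is a random placeholder for such an encoder.
```python
# Toy sketch: optimize a style code so its (placeholder) cross-modal embedding
# matches the embedding of a target style description.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
embed = torch.nn.Linear(256, 512)            # placeholder cross-modal projection
text_embedding = torch.randn(1, 512)         # e.g. an encoded style phrase

style_code = torch.zeros(1, 256, requires_grad=True)
opt = torch.optim.Adam([style_code], lr=0.05)
for _ in range(100):
    sim = F.cosine_similarity(embed(style_code), text_embedding, dim=-1)
    loss = 1.0 - sim.mean()                  # pull the code toward the target style
    opt.zero_grad()
    loss.backward()
    opt.step()
print(loss.item())   # the optimized style_code would then condition the generator
```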
arXiv Detail & Related papers (2023-12-04T06:38:23Z) - Customize StyleGAN with One Hand Sketch [0.0]
We propose a framework to control StyleGAN imagery with a single user sketch.
We learn a conditional distribution in the latent space of a pre-trained StyleGAN model via energy-based learning.
Our model can generate multi-modal images semantically aligned with the input sketch.
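The training objective itself is not given in the summary; as a hedged illustration of energy-based sampling in a latent space, the sketch below draws latent codes with Langevin dynamics from a toy energy network conditioned on a placeholder sketch embedding. The sampled codes would then be decoded by a pretrained StyleGAN.
```python
# Minimal sketch (not the authors' code) of Langevin sampling from an
# energy-based model over a StyleGAN latent code w, conditioned on a sketch
# embedding. The energy network and embedding here are toy placeholders.
import torch
import torch.nn as nn

class LatentEnergy(nn.Module):
    """Scalar energy E(w, c): low energy = latent consistent with the sketch condition."""
    def __init__(self, w_dim=512, c_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(w_dim + c_dim, 256), nn.SiLU(),
            nn.Linear(256, 256), nn.SiLU(),
            nn.Linear(256, 1),
        )

    def forward(self, w, c):
        return self.net(torch.cat([w, c], dim=-1)).squeeze(-1)

def langevin_sample(energy, c, n=4, steps=60, step_size=0.01, noise_scale=0.005):
    """Draw latents by gradient descent on the energy plus Gaussian noise."""
    w = torch.randn(n, 512, requires_grad=True)
    for _ in range(steps):
        e = energy(w, c.expand(n, -1)).sum()
        grad, = torch.autograd.grad(e, w)
        with torch.no_grad():
            w = w - step_size * grad + noise_scale * torch.randn_like(w)
        w.requires_grad_(True)
    return w.detach()

if __name__ == "__main__":
    torch.manual_seed(0)
    E = LatentEnergy()
    sketch_embedding = torch.randn(1, 128)   # placeholder for an encoded hand sketch
    w_samples = langevin_sample(E, sketch_embedding)
    print(w_samples.shape)  # torch.Size([4, 512]) -> feed into a pretrained StyleGAN
```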
arXiv Detail & Related papers (2023-10-29T09:32:33Z) - Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion [34.662798793560995]
We present a simple yet highly effective approach to personalization using a highly personalized (PerHi) text embedding.
Our method does not require model fine-tuning or identifiers, yet still enables manipulation of background, texture, and motion with just a single image and target text.
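A minimal, framework-agnostic sketch of the "optimize an embedding, freeze the model" pattern this entry describes: a toy frozen denoiser stands in for the actual diffusion model, and only a personal text embedding is trained to reconstruct the single reference image.
```python
# Sketch of optimizing a personal text embedding against a frozen generator.
# `FrozenDenoiser` is a toy placeholder, not Stable Diffusion; only the
# "train the embedding, not the model" pattern is illustrated.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FrozenDenoiser(nn.Module):
    """Toy stand-in for a frozen text-conditioned denoising model."""
    def __init__(self, emb_dim=64):
        super().__init__()
        self.img_net = nn.Conv2d(3, 3, 3, padding=1)
        self.cond = nn.Linear(emb_dim, 3)

    def forward(self, noisy, text_emb):
        bias = self.cond(text_emb).view(-1, 3, 1, 1)
        return self.img_net(noisy) + bias        # predicts the added noise

torch.manual_seed(0)
model = FrozenDenoiser().eval()
for p in model.parameters():
    p.requires_grad_(False)                      # the generator stays frozen

image = torch.rand(1, 3, 32, 32)                 # the single reference image
personal_emb = torch.zeros(1, 64, requires_grad=True)  # only this is learned
opt = torch.optim.Adam([personal_emb], lr=1e-2)

for step in range(200):
    noise = torch.randn_like(image)
    noisy = image + noise                        # (a real schedule would scale these)
    pred = model(noisy, personal_emb)
    loss = F.mse_loss(pred, noise)               # standard denoising objective
    opt.zero_grad()
    loss.backward()
    opt.step()

# `personal_emb` can then be combined with target-text embeddings for editing.
print(loss.item())
```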
arXiv Detail & Related papers (2023-03-15T17:07:45Z) - Federated Domain Generalization for Image Recognition via Cross-Client Style Transfer [60.70102634957392]
Domain generalization (DG) has been a hot topic in image recognition, with the goal of training a general model that performs well on unseen domains.
In this paper, we propose a novel domain generalization method for image recognition through cross-client style transfer (CCST) without exchanging data samples.
Our method outperforms recent SOTA DG methods on two DG benchmarks (PACS, OfficeHome) and a large-scale medical image dataset (Camelyon17) in the FL setting.
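The summary does not state what is exchanged in place of data samples. One common way to move style across clients without sharing pixels is to exchange only channel-wise feature statistics and apply AdaIN locally, as in the hedged sketch below; whether CCST shares exactly these statistics is not stated here, and the tiny encoder is a placeholder for a shared pretrained feature extractor.
```python
# Hedged sketch: share only channel-wise feature statistics across clients and
# re-normalize local features with them (AdaIN), so no image ever leaves a client.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 16, 3, padding=1))

def channel_stats(feat, eps=1e-5):
    """Per-channel mean/std -- a compact 'style' summary that reveals no pixels."""
    mean = feat.mean(dim=(2, 3), keepdim=True)
    std = feat.std(dim=(2, 3), keepdim=True) + eps
    return mean, std

def adain(content_feat, style_mean, style_std):
    """Re-normalize content features to the received style statistics."""
    c_mean, c_std = channel_stats(content_feat)
    return style_std * (content_feat - c_mean) / c_std + style_mean

torch.manual_seed(0)
client_a_img = torch.rand(1, 3, 64, 64)      # stays on client A
client_b_img = torch.rand(1, 3, 64, 64)      # stays on client B

# Client B computes and shares only these statistics (two small vectors).
style_mean, style_std = channel_stats(encoder(client_b_img))

# Client A stylizes its own features locally with B's statistics.
stylized = adain(encoder(client_a_img), style_mean, style_std)
print(stylized.shape)  # torch.Size([1, 16, 64, 64])
```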
arXiv Detail & Related papers (2022-10-03T13:15:55Z) - Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning.
Our framework consists of three key components: a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of the style distribution, and a generative network for image style transfer.
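A small sketch of a contrastive style objective of the general kind named here: style codes from two views of the same artwork are pulled together and pushed away from other artworks via InfoNCE. It illustrates the idea, not the CAST implementation.
```python
# Sketch of an InfoNCE-style contrastive loss over style codes: codes_a[i] and
# codes_b[i] are two augmented views of the same style image (positive pair).
import torch
import torch.nn.functional as F

def style_contrastive_loss(codes_a, codes_b, temperature=0.1):
    a = F.normalize(codes_a, dim=-1)
    b = F.normalize(codes_b, dim=-1)
    logits = a @ b.t() / temperature           # similarity of every pair
    targets = torch.arange(a.shape[0])         # matching index = positive pair
    return F.cross_entropy(logits, targets)

torch.manual_seed(0)
codes_view1 = torch.randn(8, 128)   # style-projector outputs for 8 images, view 1
codes_view2 = torch.randn(8, 128)   # same 8 images, differently augmented
print(style_contrastive_loss(codes_view1, codes_view2).item())
```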
arXiv Detail & Related papers (2022-05-19T13:11:24Z) - StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators [63.85888518950824]
We present a text-driven method that allows shifting a generative model to new domains.
We show that through natural language prompts and a few minutes of training, our method can adapt a generator across a multitude of domains.
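Text-driven generator adaptation of this kind is typically trained with a directional objective in CLIP space: the change between the frozen and the adapted generator's images should align with the change between the source and target text prompts. The sketch below computes such a loss on random placeholder embeddings standing in for CLIP's image and text encoders.
```python
# Sketch of a directional CLIP-style objective for text-driven generator
# adaptation. Embeddings are random placeholders standing in for CLIP encoders
# applied to G_frozen(w), G_trainable(w), and the source/target prompts.
import torch
import torch.nn.functional as F

def directional_loss(img_emb_frozen, img_emb_trainable, txt_emb_source, txt_emb_target):
    """1 - cosine similarity between the image-space and text-space edit directions."""
    d_img = F.normalize(img_emb_trainable - img_emb_frozen, dim=-1)
    d_txt = F.normalize(txt_emb_target - txt_emb_source, dim=-1)
    return 1.0 - (d_img * d_txt).sum(dim=-1).mean()

torch.manual_seed(0)
e_img_frozen = torch.randn(4, 512)
e_img_train  = torch.randn(4, 512, requires_grad=True)
e_txt_source = torch.randn(1, 512)   # e.g. encoded "photo"
e_txt_target = torch.randn(1, 512)   # e.g. encoded "sketch"
loss = directional_loss(e_img_frozen, e_img_train, e_txt_source, e_txt_target)
loss.backward()   # gradients would flow into the trainable generator's weights
print(loss.item())
```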
arXiv Detail & Related papers (2021-08-02T14:46:46Z) - Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
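A toy sketch of the mechanism the summary names: spatially-varying modulation parameters are warped by a flow field before being applied as a scale-and-shift. The flow and parameters are random placeholders rather than the learned SAWN modules.
```python
# Sketch of warping spatial modulation parameters with a flow field, then using
# them to modulate generator features.
import torch
import torch.nn.functional as F

def warp(params, flow):
    """Warp a (B, C, H, W) parameter map by a (B, H, W, 2) normalized flow offset."""
    b, _, h, w = params.shape
    # Base sampling grid in [-1, 1] x [-1, 1].
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
    return F.grid_sample(params, base + flow, align_corners=True, padding_mode="border")

torch.manual_seed(0)
features = torch.randn(1, 8, 32, 32)          # generator features to be modulated
gamma    = torch.randn(1, 8, 32, 32)          # spatial scale parameters
beta     = torch.randn(1, 8, 32, 32)          # spatial bias parameters
flow     = 0.1 * torch.randn(1, 32, 32, 2)    # learned flow field (placeholder)

# Warp the modulation parameters, then apply them as scale-and-shift.
gamma_w, beta_w = warp(gamma, flow), warp(beta, flow)
out = gamma_w * F.instance_norm(features) + beta_w
print(out.shape)  # torch.Size([1, 8, 32, 32])
```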
arXiv Detail & Related papers (2021-05-31T07:07:44Z) - TSIT: A Simple and Versatile Framework for Image-to-Image Translation [103.92203013154403]
We introduce a simple and versatile framework for image-to-image translation.
We provide a carefully designed two-stream generative model with newly proposed feature transformations.
This allows multi-scale semantic structure information and style representation to be effectively captured and fused by the network.
A systematic study compares the proposed method with several state-of-the-art task-specific baselines, verifying its effectiveness in both perceptual quality and quantitative evaluations.
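As a hedged sketch of two-stream fusion at a single scale: spatial scale/shift parameters predicted from a content-stream feature denormalize the generator feature, and channel statistics from a style-stream feature are re-injected AdaIN-style. This illustrates the general pattern, not TSIT's exact feature transformations.
```python
# Toy fusion block: structure comes from the content stream as spatial
# modulation, style comes from the style stream as channel statistics.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FuseBlock(nn.Module):
    def __init__(self, gen_ch, content_ch):
        super().__init__()
        self.to_gamma = nn.Conv2d(content_ch, gen_ch, 3, padding=1)
        self.to_beta = nn.Conv2d(content_ch, gen_ch, 3, padding=1)

    def forward(self, gen_feat, content_feat, style_feat):
        # Spatially-varying scale/shift from the content (structure) stream.
        x = F.instance_norm(gen_feat)
        x = self.to_gamma(content_feat) * x + self.to_beta(content_feat)
        # Channel-wise statistics from the style stream (AdaIN-like).
        s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
        s_std = style_feat.std(dim=(2, 3), keepdim=True)
        return s_std * F.instance_norm(x) + s_mean

torch.manual_seed(0)
block = FuseBlock(gen_ch=16, content_ch=8)
gen_feat = torch.randn(1, 16, 32, 32)       # generator's current feature map
content_feat = torch.randn(1, 8, 32, 32)    # content-stream feature, same scale
style_feat = torch.randn(1, 16, 32, 32)     # style-stream feature, same scale
print(block(gen_feat, content_feat, style_feat).shape)  # torch.Size([1, 16, 32, 32])
```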
arXiv Detail & Related papers (2020-07-23T15:34:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.