MODIFY: Model-driven Face Stylization without Style Images
- URL: http://arxiv.org/abs/2303.09831v1
- Date: Fri, 17 Mar 2023 08:35:17 GMT
- Title: MODIFY: Model-driven Face Stylization without Style Images
- Authors: Yuhe Ding, Jian Liang, Jie Cao, Aihua Zheng, Ran He
- Abstract summary: Existing face stylization methods always require access to the target (style) domain during the translation process.
We propose a new method called MODel-drIven Face stYlization (MODIFY), which relies on a generative model to bypass the dependence on target images.
Experimental results on several different datasets validate the effectiveness of MODIFY for unsupervised face stylization.
- Score: 77.24793103549158
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing face stylization methods always require access to the target
(style) domain during the translation process, which violates privacy
regulations and limits their applicability in real-world systems. To address
this issue, we propose a new method called MODel-drIven Face stYlization
(MODIFY), which relies on a generative model to bypass the dependence on the
target images. Briefly, MODIFY first trains a generative model in the target
domain and then translates a source input to the target domain via the provided
style model. To preserve the multimodal style information, MODIFY further
introduces an additional remapping network, mapping a known continuous
distribution into the encoder's embedding space. During translation in the
source domain, MODIFY fine-tunes the encoder module within the target
style-preserving model to capture the content of the source input as precisely
as possible. Our method is extremely simple and satisfies versatile training
modes for face stylization. Experimental results on several different datasets
validate the effectiveness of MODIFY for unsupervised face stylization.
Related papers
- StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization [85.18995948334592]
Single domain generalization (single DG) aims at learning a robust model generalizable to unseen domains from only one training domain.
State-of-the-art approaches have mostly relied on data augmentations, such as adversarial perturbation and style enhancement, to synthesize new data.
We propose StyDeSty, which explicitly accounts for the alignment of the source and pseudo domains in the process of data augmentation.
arXiv Detail & Related papers (2024-06-01T02:41:34Z) - ArtWeaver: Advanced Dynamic Style Integration via Diffusion Model [73.95608242322949]
Stylized Text-to-Image Generation (STIG) aims to generate images from text prompts and style reference images.
We present ArtWeaver, a novel framework that leverages pretrained Stable Diffusion to address challenges such as misinterpreted styles and inconsistent semantics.
arXiv Detail & Related papers (2024-05-24T07:19:40Z) - DPStyler: Dynamic PromptStyler for Source-Free Domain Generalization [43.67213274161226]
Source-Free Domain Generalization (SFDG) aims to develop a model that works for unseen target domains without relying on any source domain.
Research in SFDG primarily builds upon the existing knowledge of large-scale vision-language models.
We introduce Dynamic PromptStyler (DPStyler), comprising Style Generation and Style Removal modules.
arXiv Detail & Related papers (2024-03-25T12:31:01Z) - HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models [84.12784265734238]
The goal of Arbitrary Style Transfer (AST) is injecting the artistic features of a style reference into a given image/video.
We propose HiCAST, which is capable of explicitly customizing the stylization results according to various sources of semantic clues.
A novel learning objective is leveraged for video diffusion model training, which significantly improves cross-frame temporal consistency.
arXiv Detail & Related papers (2024-01-11T12:26:23Z) - One-shot Unsupervised Domain Adaptation with Personalized Diffusion Models [15.590759602379517]
Adapting a segmentation model from a labeled source domain to a target domain is one of the most challenging problems in domain adaptation.
We leverage text-to-image diffusion models to generate a synthetic target dataset with photo-realistic images.
Experiments show that our method surpasses the state-of-the-art OSUDA methods by up to +7.1%.
arXiv Detail & Related papers (2023-03-31T14:16:38Z) - Adversarial Style Augmentation for Domain Generalized Urban-Scene Segmentation [120.96012935286913]
We propose a novel adversarial style augmentation approach, which can generate hard stylized images during training.
Experiments on two synthetic-to-real semantic segmentation benchmarks demonstrate that AdvStyle can significantly improve the model performance on unseen real domains.
arXiv Detail & Related papers (2022-07-11T14:01:25Z) - Towards Controllable and Photorealistic Region-wise Image Manipulation [11.601157452472714]
We present a generative model with auto-encoder architecture for per-region style manipulation.
We apply a code consistency loss to enforce an explicit disentanglement between content and style latent representations.
The model is constrained by a content alignment loss to ensure that foreground editing does not interfere with background content.
arXiv Detail & Related papers (2021-08-19T13:29:45Z) - StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval [119.03470556503942]
The cross-modal matching problem is typically solved by learning a joint embedding space where the semantic content shared between photo and sketch modalities is preserved.
An effective model needs to explicitly account for this style diversity and, crucially, generalize to unseen user styles.
Our model can not only disentangle the cross-modal shared semantic content, but can adapt the disentanglement to any unseen user style as well, making the model truly agnostic.
arXiv Detail & Related papers (2021-03-29T15:44:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.