Distribution Aligned Multimodal and Multi-Domain Image Stylization
- URL: http://arxiv.org/abs/2006.01431v1
- Date: Tue, 2 Jun 2020 07:25:53 GMT
- Title: Distribution Aligned Multimodal and Multi-Domain Image Stylization
- Authors: Minxuan Lin, Fan Tang, Weiming Dong, Xiao Li, Chongyang Ma, Changsheng Xu
- Abstract summary: We propose a unified framework for multimodal and multi-domain style transfer.
The key component of our method is a novel style distribution alignment module.
We validate our proposed framework on painting style transfer with a variety of artistic styles and genres.
- Score: 76.74823384524814
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal and multi-domain stylization are two important problems in the
field of image style transfer. Currently, there are few methods that can
perform both multimodal and multi-domain stylization simultaneously. In this
paper, we propose a unified framework for multimodal and multi-domain style
transfer with the support of both exemplar-based reference and randomly sampled
guidance. The key component of our method is a novel style distribution
alignment module that eliminates the explicit distribution gaps between various
style domains and reduces the risk of mode collapse. The multimodal diversity
is ensured by either guidance from multiple images or random style code, while
the multi-domain controllability is directly achieved by using a domain label.
We validate our proposed framework on painting style transfer with a variety of
artistic styles and genres. Qualitative and quantitative comparisons with
state-of-the-art methods demonstrate that our method can generate high-quality
results across multiple style domains and multimodal instances, guided by
either a reference style or a randomly sampled style.
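The abstract names a style distribution alignment module driven by a style code (exemplar-encoded or randomly sampled) and a domain label, but gives no implementation details. The PyTorch sketch below is only one plausible AdaIN-style reading of that interface; the class and all parameters (StyleAlign, style_dim, num_domains) are invented here for illustration and are not the paper's actual module.

```python
import torch
import torch.nn as nn

class StyleAlign(nn.Module):
    """Hypothetical style distribution alignment block: a style code and a
    domain label produce per-channel (gain, bias) that re-normalize content
    features, AdaIN-style. Invented for illustration; not the paper's module."""

    def __init__(self, style_dim: int, num_domains: int, channels: int):
        super().__init__()
        # One projection per domain: a simple way to absorb explicit
        # distribution gaps between style domains (an assumption).
        self.proj = nn.ModuleList(
            [nn.Linear(style_dim, channels * 2) for _ in range(num_domains)]
        )
        self.norm = nn.InstanceNorm2d(channels, affine=False)

    def forward(self, content_feat, style_code, domain: int):
        # content_feat: (N, C, H, W); style_code: (N, style_dim)
        gain, bias = self.proj[domain](style_code).chunk(2, dim=1)
        gain = gain[:, :, None, None]
        bias = bias[:, :, None, None]
        return self.norm(content_feat) * (1 + gain) + bias

# Multimodal guidance: the style code may be encoded from an exemplar
# image or sampled from a prior, as the abstract describes.
align = StyleAlign(style_dim=64, num_domains=4, channels=256)
content = torch.randn(2, 256, 32, 32)
style_code = torch.randn(2, 64)            # randomly sampled guidance
stylized = align(content, style_code, domain=1)
print(stylized.shape)                      # torch.Size([2, 256, 32, 32])
```

In this reading, the domain label controls which projection the style code passes through, while multimodality comes from where the style code originates (exemplar encoder versus prior sample).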
Related papers
- MuVieCAST: Multi-View Consistent Artistic Style Transfer [6.767885381740952]
We introduce MuVieCAST, a modular multi-view consistent style transfer network architecture.
MuVieCAST supports both sparse and dense views, making it versatile enough to handle a wide range of multi-view image datasets.
arXiv Detail & Related papers (2023-12-08T14:01:03Z) - Generative Powers of Ten [60.6740997942711]
- Generative Powers of Ten [60.6740997942711]
We present a method that uses a text-to-image model to generate consistent content across multiple image scales.
We achieve this through a joint multi-scale diffusion sampling approach.
Our method enables deeper levels of zoom than traditional super-resolution methods.
arXiv Detail & Related papers (2023-12-04T18:59:25Z) - Multi-domain Unsupervised Image-to-Image Translation with Appearance
Adaptive Convolution [62.4972011636884]
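The joint multi-scale diffusion sampler itself is not reproduced here; as a rough illustration of cross-scale consistency during sampling, this sketch overwrites the central region of a coarse denoised estimate with the downsampled fine-scale estimate at each step. The zoom factor and hard overwrite are assumptions, not the paper's blending scheme.

```python
import torch.nn.functional as F

def enforce_zoom_consistency(coarse, fine, zoom=2):
    """coarse, fine: (N, C, H, W) denoised estimates at two zoom levels,
    where `fine` depicts the central 1/zoom crop of `coarse` in detail.
    Returns `coarse` made consistent with `fine` (hard overwrite)."""
    h, w = coarse.shape[-2:]
    hz, wz = h // zoom, w // zoom
    # Downsample the fine estimate to its footprint inside `coarse`.
    small = F.interpolate(fine, size=(hz, wz), mode='bilinear',
                          align_corners=False)
    out = coarse.clone()
    top, left = (h - hz) // 2, (w - wz) // 2
    out[:, :, top:top + hz, left:left + wz] = small
    return out
```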
- Multi-domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution [62.4972011636884]
We propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework.
We exploit the decomposed content feature and appearance adaptive convolution to translate an image into a target appearance.
We show that the proposed method produces visually diverse and plausible results in multiple domains compared to the state-of-the-art methods.
arXiv Detail & Related papers (2022-02-06T14:12:34Z) - Separating Content and Style for Unsupervised Image-to-Image Translation [20.44733685446886]
- Separating Content and Style for Unsupervised Image-to-Image Translation [20.44733685446886]
Unsupervised image-to-image translation aims to learn the mapping between two visual domains with unpaired samples.
We propose to separate the content code and style code simultaneously in a unified framework.
Based on the correlation between the latent features and the high-level domain-invariant tasks, the proposed framework demonstrates superior performance.
arXiv Detail & Related papers (2021-10-27T12:56:50Z) - Anisotropic Stroke Control for Multiple Artists Style Transfer [36.92721585146738]
- Anisotropic Stroke Control for Multiple Artists Style Transfer [36.92721585146738]
A Stroke Control Multi-Artist Style Transfer framework is developed.
An Anisotropic Stroke Module (ASM) endows the network with adaptive semantic consistency across various styles.
In contrast to a single-scale conditional discriminator, our discriminator can capture multi-scale texture cues.
arXiv Detail & Related papers (2020-10-16T05:32:26Z) - TSIT: A Simple and Versatile Framework for Image-to-Image Translation [103.92203013154403]
- TSIT: A Simple and Versatile Framework for Image-to-Image Translation [103.92203013154403]
We introduce a simple and versatile framework for image-to-image translation.
We provide a carefully designed two-stream generative model with newly proposed feature transformations.
This allows multi-scale semantic structure information and style representation to be effectively captured and fused by the network.
A systematic study compares the proposed method with several state-of-the-art task-specific baselines, verifying its effectiveness in both perceptual quality and quantitative evaluations.
arXiv Detail & Related papers (2020-07-23T15:34:06Z) - Manifold Alignment for Semantically Aligned Style Transfer [61.1274057338588]
- Manifold Alignment for Semantically Aligned Style Transfer [61.1274057338588]
We make a new assumption that image features from the same semantic region form a manifold, and that an image with multiple semantic regions follows a multi-manifold distribution.
Based on this assumption, the style transfer problem is formulated as aligning two multi-manifold distributions.
The proposed framework allows semantically similar regions between the output and the style image to share similar style patterns.
arXiv Detail & Related papers (2020-05-21T16:52:37Z) - Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings [76.85673049332428]
- Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings [76.85673049332428]
Learned joint representations of images and text form the backbone of several important cross-domain tasks such as image captioning.
We propose a novel semi-supervised framework, which models shared information between domains and domain-specific information separately.
We demonstrate the effectiveness of our model on diverse tasks, including image captioning and text-to-image synthesis.
arXiv Detail & Related papers (2020-02-16T19:49:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.