StyleFlow For Content-Fixed Image to Image Translation
- URL: http://arxiv.org/abs/2207.01909v1
- Date: Tue, 5 Jul 2022 09:40:03 GMT
- Title: StyleFlow For Content-Fixed Image to Image Translation
- Authors: Weichen Fan, Jinghuan Chen, Jiabin Ma, Jun Hou, Shuai Yi
- Abstract summary: StyleFlow is a new I2I translation model that consists of normalizing flows and a novel Style-Aware Normalization (SAN) module.
Our model supports both image-guided translation and multi-modal synthesis.
- Score: 15.441136520005578
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Image-to-image (I2I) translation is a challenging topic in computer vision.
We divide this problem into three tasks: strongly constrained translation,
normally constrained translation, and weakly constrained translation. The
constraint here indicates the extent to which the content or semantic
information in the original image is preserved. Although previous approaches
have achieved good performance in weakly constrained tasks, they fail to
fully preserve the content in both strongly and normally constrained tasks,
such as photo-realism synthesis, style transfer, and colorization. To
achieve content-preserving transfer in strongly constrained and normally
constrained tasks, we propose StyleFlow, a new I2I translation model that
consists of normalizing flows and a novel Style-Aware Normalization (SAN)
module. With its invertible network structure, StyleFlow first projects input
images into a deep feature space in the forward pass, while the backward pass
uses the SAN module to perform a content-fixed feature transformation and
then projects the features back to image space. Our model supports both image-guided
translation and multi-modal synthesis. We evaluate our model in several I2I
translation benchmarks, and the results show that the proposed model has
advantages over previous methods in both strongly constrained and normally
constrained tasks.
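As a concrete illustration of this forward/backward design, here is a minimal sketch that pairs an invertible affine-coupling layer (a standard normalizing-flow building block) with an AdaIN-style normalization standing in for the SAN module, whose exact formulation the abstract does not specify. All names, shapes, and the AdaIN choice are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Invertible coupling layer: half of the channels are rescaled and
    shifted using parameters predicted from the other half."""
    def __init__(self, channels, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels // 2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),  # predicts scale and shift
        )

    def forward(self, x):
        a, b = x.chunk(2, dim=1)
        log_s, t = self.net(a).chunk(2, dim=1)
        s = torch.sigmoid(log_s + 2.0)  # bounded positive scale for stability
        return torch.cat([a, (b + t) * s], dim=1)

    def inverse(self, y):
        a, b = y.chunk(2, dim=1)
        log_s, t = self.net(a).chunk(2, dim=1)
        s = torch.sigmoid(log_s + 2.0)
        return torch.cat([a, b / s - t], dim=1)

def adain(content_feat, style_feat, eps=1e-5):
    """AdaIN-style stand-in for SAN: shift the channel-wise statistics of
    the content features to match those of the style features."""
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps
    return (content_feat - c_mean) / c_std * s_std + s_mean

# Toy pipeline with 6-channel "images" (real flows squeeze RGB inputs into
# more channels so that the channel split is possible):
flow = AffineCoupling(channels=6)
content, style = torch.randn(1, 6, 32, 32), torch.randn(1, 6, 32, 32)

z_content = flow(content)            # forward pass: image -> deep feature space
z_style = flow(style)
z_mixed = adain(z_content, z_style)  # content-fixed feature transformation
output = flow.inverse(z_mixed)       # backward pass: feature -> image space
```

Because the decoder is the exact inverse of the encoder, whatever the feature transformation leaves untouched is reconstructed exactly, which is the sense in which the translation is content-fixed.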
Related papers
- Unpaired Image-to-Image Translation with Content Preserving Perspective: A Review [1.1243043117244755]
Image-to-image translation (I2I) transforms an image from a source domain to a target domain while preserving source content.
The degree of preservation of the content of the source images in the translation process can be different according to the problem and the intended application.
We divide the different tasks in the field of image-to-image translation into three categories: Fully Content-Preserving, Partially Content-Preserving, and Non-Content-Preserving.
arXiv Detail & Related papers (2025-02-11T20:09:29Z)
- Ensuring Consistency for In-Image Translation [47.1986912570945]
The in-image machine translation task involves translating text embedded within images, with the translated results presented in image format.
We argue that two types of consistency must be upheld in this task: translation consistency and image generation consistency.
We introduce a novel two-stage framework named HCIIT, which performs text-image translation with a multimodal multilingual large language model in the first stage and image backfilling with a diffusion model in the second stage (a stub-level sketch of such a pipeline follows this entry).
arXiv Detail & Related papers (2024-12-24T03:50:03Z)
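Since the summary gives only the stage decomposition, the skeleton below shows one way such a two-stage pipeline could be wired together; every function is a hypothetical stub standing in for components the abstract does not detail (the multimodal LLM and the diffusion inpainter), and text regions are assumed to come from an upstream OCR step.

```python
from dataclasses import dataclass

@dataclass
class TextRegion:
    box: tuple   # (x0, y0, x1, y1) bounding box in pixels
    text: str    # recognized source-language text

def mllm_translate(image, regions, target_lang):
    """Stage 1 (stub): a multimodal multilingual LLM translates each region,
    conditioning on the image for context."""
    return [r.text for r in regions]  # placeholder: echo the source text

def diffusion_backfill(image, regions, translations):
    """Stage 2 (stub): a diffusion model erases the source text and renders
    the translations back into the image."""
    return image  # placeholder: return the image unchanged

def in_image_translate(image, regions, target_lang="en"):
    translations = mllm_translate(image, regions, target_lang)
    return diffusion_backfill(image, regions, translations)
```

Splitting the task this way lets each consistency target be enforced where it matters: translation consistency in stage one, image-generation consistency in stage two.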
- Translatotron-V(ison): An End-to-End Model for In-Image Machine Translation [81.45400849638347]
In-image machine translation (IIMT) aims to translate an image containing texts in the source language into an image containing translations in the target language.
In this paper, we propose an end-to-end IIMT model consisting of four modules.
Our model achieves competitive performance compared to cascaded models with only 70.9% of parameters, and significantly outperforms the pixel-level end-to-end IIMT model.
arXiv Detail & Related papers (2024-07-03T08:15:39Z)
- AnyTrans: Translate AnyText in the Image with Large Scale Models [88.5887934499388]
This paper introduces AnyTrans, an all-encompassing framework for the task Translate AnyText in the Image (TATI).
Our framework incorporates contextual cues from both textual and visual elements during translation.
We have meticulously compiled a test dataset called MTIT6, which consists of multilingual text image translation data from six language pairs.
arXiv Detail & Related papers (2024-06-17T11:37:48Z)
- Hierarchy Flow For High-Fidelity Image-to-Image Translation [38.87847690777645]
We propose a novel flow-based model to achieve better content preservation during translation.
Our approach achieves state-of-the-art performance, with convincing advantages in both strong- and normal-fidelity tasks.
arXiv Detail & Related papers (2023-08-14T03:11:17Z)
- Unsupervised Image-to-Image Translation with Generative Prior [103.54337984566877]
Unsupervised image-to-image translation aims to learn the translation between two visual domains without paired data.
We present a novel framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), to improve the overall quality and applicability of the translation algorithm.
arXiv Detail & Related papers (2022-04-07T17:59:23Z)
- Unbalanced Feature Transport for Exemplar-based Image Translation [51.54421432912801]
This paper presents a general image translation framework that incorporates optimal transport for feature alignment between conditional inputs and style exemplars.
We show that our method achieves superior image translation qualitatively and quantitatively as compared with the state-of-the-art (a toy sketch of the transport step follows this entry).
arXiv Detail & Related papers (2021-06-19T12:07:48Z)
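For intuition, here is a minimal balanced-Sinkhorn sketch of entropic-regularized optimal transport between two feature sets; the paper's unbalanced formulation relaxes the hard marginal constraints that this toy version enforces, and all names and parameters are illustrative.

```python
import torch

def sinkhorn_plan(feat_a, feat_b, epsilon=0.05, iters=100):
    """Entropic-regularized OT plan between feature sets of shape (n, d)
    and (m, d). Balanced version: both marginals are fixed to uniform."""
    cost = torch.cdist(feat_a, feat_b) ** 2  # pairwise squared distances
    cost = cost / cost.max()                 # rescale for numerical stability
    K = torch.exp(-cost / epsilon)           # Gibbs kernel
    n, m = cost.shape
    mu, nu = torch.full((n,), 1.0 / n), torch.full((m,), 1.0 / m)
    u = torch.ones(n)
    for _ in range(iters):                   # Sinkhorn scaling iterations
        v = nu / (K.t() @ u)
        u = mu / (K @ v)
    return u[:, None] * K * v[None, :]       # transport plan of shape (n, m)

# Align exemplar (style) features with conditional-input features:
cond = torch.randn(100, 64)                  # hypothetical feature rows
exemplar = torch.randn(120, 64)
plan = sinkhorn_plan(cond, exemplar)
# Re-express each conditional feature as a transport-weighted average of
# exemplar features (rows of the plan renormalized to sum to one).
warped = (plan / plan.sum(dim=1, keepdim=True)) @ exemplar
```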
- Unpaired Image-to-Image Translation via Latent Energy Transport [61.62293304236371]
Image-to-image translation aims to preserve source content while translating images to the discriminative style of a target domain.
In this paper, we propose to deploy an energy-based model (EBM) in the latent space of a pretrained autoencoder for this task.
Our model is the first to be applicable to 1024×1024-resolution unpaired image translation (a toy sketch of the latent Langevin update follows this entry).
arXiv Detail & Related papers (2020-12-01T17:18:58Z)
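As a rough illustration of the latent-energy idea, the sketch below runs Langevin dynamics on an energy function defined over the latents of an autoencoder. The encoder, decoder, and energy network here are untrained stand-ins; in the paper's setting the autoencoder is pretrained and frozen and the EBM is learned for the target domain.

```python
import torch
import torch.nn as nn

latent_dim = 128

# Hypothetical stand-ins for a pretrained autoencoder and a learned EBM:
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 3 * 64 * 64),
                        nn.Unflatten(1, (3, 64, 64)))
energy = nn.Sequential(nn.Linear(latent_dim, 256), nn.SiLU(),
                       nn.Linear(256, 1))

def translate(x, steps=60, step_size=0.01):
    """Translate by descending the target-domain energy in latent space.
    Only the latent code moves and the decoder is shared, which is why
    source content is largely preserved."""
    z = encoder(x).detach()
    for _ in range(steps):
        z = z.clone().requires_grad_(True)
        grad = torch.autograd.grad(energy(z).sum(), z)[0]
        # Langevin update: half a gradient step plus Gaussian noise.
        z = (z.detach() - 0.5 * step_size * grad
             + (step_size ** 0.5) * torch.randn_like(z))
    return decoder(z)

x_source = torch.randn(4, 3, 64, 64)  # toy batch of source-domain images
x_target = translate(x_source)
```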
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.