GP-UNIT: Generative Prior for Versatile Unsupervised Image-to-Image
Translation
- URL: http://arxiv.org/abs/2306.04636v1
- Date: Wed, 7 Jun 2023 17:59:22 GMT
- Title: GP-UNIT: Generative Prior for Versatile Unsupervised Image-to-Image
Translation
- Authors: Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy
- Abstract summary: We introduce a novel versatile framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT).
GP-UNIT is able to perform valid translations between both close domains and distant domains.
We validate the superiority of GP-UNIT over state-of-the-art translation models in robust, high-quality and diversified translations.
- Score: 103.54337984566877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in deep learning have witnessed many successful unsupervised
image-to-image translation models that learn correspondences between two visual
domains without paired data. However, it is still a great challenge to build
robust mappings between various domains especially for those with drastic
visual discrepancies. In this paper, we introduce a novel versatile framework,
Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), that
improves the quality, applicability and controllability of the existing
translation models. The key idea of GP-UNIT is to distill the generative prior
from pre-trained class-conditional GANs to build coarse-level cross-domain
correspondences, and to apply the learned prior to adversarial translations to
excavate fine-level correspondences. With the learned multi-level content
correspondences, GP-UNIT is able to perform valid translations between both
close domains and distant domains. For close domains, GP-UNIT can be
conditioned on a parameter to determine the intensity of the content
correspondences during translation, allowing users to balance between content
and style consistency. For distant domains, semi-supervised learning is
explored to guide GP-UNIT to discover accurate semantic correspondences that
are hard to learn solely from the appearance. We validate the superiority of
GP-UNIT over state-of-the-art translation models in robust, high-quality and
diversified translations between various domains through extensive experiments.
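The abstract describes a two-stage scheme: a generative prior distilled from a pre-trained class-conditional GAN first builds coarse cross-domain content correspondences, and a translation network then refines them, with a user-set parameter controlling how strongly source content is preserved against target style. The toy sketch below illustrates only that control-flow idea with stand-in NumPy operations; `coarse_content`, `translate`, and the `strength` blending are hypothetical simplifications, not the paper's actual networks or losses.

```python
import numpy as np

def coarse_content(x):
    """Stage 1 (stand-in): distill a coarse, domain-invariant content code
    from the source image, playing the role of the distilled generative
    prior. Here it is just a per-channel spatial average."""
    return x.mean(axis=(1, 2))  # shape: (channels,)

def translate(x_src, style_tgt, strength=1.0):
    """Stage 2 (stand-in): produce a target-domain output from the content
    code and a target style code. `strength` mimics GP-UNIT's content-
    correspondence intensity for close domains: 1.0 preserves source
    content strictly, 0.0 lets the target style dominate."""
    c = coarse_content(x_src)
    blended = strength * c + (1.0 - strength) * style_tgt
    # A real model would decode with a generator network; we broadcast
    # the blended code back to image shape to keep the sketch runnable.
    return np.broadcast_to(blended[:, None, None], x_src.shape).copy()

# Sliding `strength` from 1.0 toward 0.0 trades content consistency for
# style consistency, the user-facing balance described in the abstract.
```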
Related papers
- Large Language Model for Multi-Domain Translation: Benchmarking and Domain CoT Fine-tuning [55.107329995417786]
Large language models (LLMs) have demonstrated impressive general understanding and generation abilities.
We establish a benchmark for multi-domain translation, featuring 25 German$\Leftrightarrow$English and 22 Chinese$\Leftrightarrow$English test sets.
We propose a domain Chain of Thought (CoT) fine-tuning technique that utilizes the intrinsic multi-domain intelligence of LLMs to improve translation performance.
arXiv Detail & Related papers (2024-10-03T16:15:04Z)
- Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos [69.29778009769862]
We introduce LaGTran, a framework that guides robust transfer of discriminative knowledge from labeled source to unlabeled target data with domain gaps.
Motivated by our observation that semantically richer text modality has more favorable transfer properties, we devise a transfer mechanism to use a source-trained text-classifier to generate predictions on the target text descriptions.
Our approach driven by language guidance is surprisingly easy and simple, yet significantly outperforms all prior approaches on challenging datasets like GeoNet and DomainNet.
arXiv Detail & Related papers (2024-03-08T18:58:46Z)
- Towards Identifiable Unsupervised Domain Translation: A Diversified Distribution Matching Approach [14.025593338693698]
Unsupervised domain translation (UDT) aims to find functions that convert samples from one domain to another without changing the high-level semantic meaning.
This study delves into the core identifiability inquiry and introduces an MPA elimination theory.
Our theory leads to a UDT learner using distribution matching over auxiliary variable-induced subsets of the domains.
arXiv Detail & Related papers (2024-01-18T01:07:00Z)
- Language-aware Domain Generalization Network for Cross-Scene Hyperspectral Image Classification [15.842081807249416]
It is necessary to explore the effectiveness of linguistic mode in assisting hyperspectral image classification.
Large-scale pre-training image-text foundation models have demonstrated great performance in a variety of downstream applications.
A Language-aware Domain Generalization Network (LDGnet) is proposed to learn cross-domain invariant representation.
arXiv Detail & Related papers (2022-09-06T10:06:10Z)
- Unsupervised Image-to-Image Translation with Generative Prior [103.54337984566877]
Unsupervised image-to-image translation aims to learn the translation between two visual domains without paired data.
We present a novel framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), to improve the overall quality and applicability of the translation algorithm.
arXiv Detail & Related papers (2022-04-07T17:59:23Z)
- Multi-domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution [62.4972011636884]
We propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework.
We exploit the decomposed content feature and appearance adaptive convolution to translate an image into a target appearance.
We show that the proposed method produces visually diverse and plausible results in multiple domains compared to the state-of-the-art methods.
arXiv Detail & Related papers (2022-02-06T14:12:34Z)
- Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation [56.55178339375146]
Image-to-Image (I2I) multi-domain translation models are usually also evaluated on the quality of their semantic results.
We propose a new training protocol based on three specific losses which help a translation network to learn a smooth and disentangled latent style space.
arXiv Detail & Related papers (2021-06-16T17:58:21Z)
- Rethinking the Truly Unsupervised Image-to-Image Translation [29.98784909971291]
The truly unsupervised image-to-image translation model (TUNIT) learns to separate image domains and translate input images into the estimated domains.
Experimental results show TUNIT achieves comparable or even better performance than the set-level supervised model trained with full labels.
TUNIT can be easily extended to semi-supervised learning with a few labeled data.
arXiv Detail & Related papers (2020-06-11T15:15:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.