Evaluation of Correctness in Unsupervised Many-to-Many Image Translation
- URL: http://arxiv.org/abs/2103.15727v1
- Date: Mon, 29 Mar 2021 16:13:03 GMT
- Title: Evaluation of Correctness in Unsupervised Many-to-Many Image Translation
- Authors: Dina Bashkirova, Ben Usman and Kate Saenko
- Abstract summary: Unsupervised many-to-many image-to-image (UMMI2I) translation methods seek to generate a plausible example from the target domain.
We propose a set of benchmarks and metrics for the evaluation of semantic correctness of UMMI2I methods.
- Score: 61.44666983942965
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given an input image from a source domain and a "guidance" image from a
target domain, unsupervised many-to-many image-to-image (UMMI2I) translation
methods seek to generate a plausible example from the target domain that
preserves domain-invariant information of the input source image and inherits
the domain-specific information from the guidance image. For example, when
translating female faces to male faces, the generated male face should have the
same expression, pose and hair color as the input female image, and the same
facial hairstyle and other male-specific attributes as the guidance male image.
Current state-of-the-art UMMI2I methods generate visually pleasing images, but,
since for most pairs of real datasets we do not know which attributes are
domain-specific and which are domain-invariant, the semantic correctness of
existing approaches has not been quantitatively evaluated yet. In this paper,
we propose a set of benchmarks and metrics for the evaluation of semantic
correctness of UMMI2I methods. We provide an extensive study of how well the
existing state-of-the-art UMMI2I translation methods preserve domain-invariant
and manipulate domain-specific attributes, and discuss the trade-offs shared by
all methods, as well as how different architectural choices affect various
aspects of semantic correctness.
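As a rough illustration of the kind of metric the abstract describes, the sketch below (hypothetical helper names, not the authors' released code) scores a set of translations with pretrained per-attribute classifiers: domain-invariant attributes are checked against the source image and domain-specific attributes against the guidance image. The split of attributes into invariant vs. specific sets is assumed to be known for the benchmark, which is exactly the information the proposed datasets provide.

```python
# Hedged sketch of attribute-based semantic-correctness scoring for UMMI2I.
# Assumptions: `predict` is any pretrained attribute classifier returning a
# dict {attribute_name: label}; the invariant/specific attribute split is
# given by the benchmark. This is an illustration, not the paper's exact code.

from typing import Callable, Dict, Iterable, List, Tuple

Image = object  # placeholder for whatever image type the predictor accepts
Predictor = Callable[[Image], Dict[str, int]]


def semantic_correctness(
    triplets: Iterable[Tuple[Image, Image, Image]],  # (source, guidance, translation)
    predict: Predictor,
    invariant_attrs: List[str],   # should be preserved from the source
    specific_attrs: List[str],    # should be taken from the guidance
) -> Dict[str, float]:
    """Return the fraction of attributes correctly preserved / manipulated."""
    inv_hits = inv_total = spec_hits = spec_total = 0
    for source, guidance, translation in triplets:
        src, gui, out = predict(source), predict(guidance), predict(translation)
        for a in invariant_attrs:
            inv_hits += int(out[a] == src[a])   # preserved from source?
            inv_total += 1
        for a in specific_attrs:
            spec_hits += int(out[a] == gui[a])  # transferred from guidance?
            spec_total += 1
    return {
        "invariance_preservation": inv_hits / max(inv_total, 1),
        "manipulation_accuracy": spec_hits / max(spec_total, 1),
    }
```

Reporting both numbers together makes the trade-off mentioned in the abstract explicit: a method can score highly on preservation by ignoring the guidance image, or highly on manipulation by discarding the source, so neither number is meaningful in isolation.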
Related papers
- Improving Generalization of Image Captioning with Unsupervised Prompt Learning [63.26197177542422]
Generalization of Image Captioning (GeneIC) learns a domain-specific prompt vector for the target domain without requiring annotated data.
GeneIC aligns visual and language modalities with a pre-trained Contrastive Language-Image Pre-Training (CLIP) model.
arXiv Detail & Related papers (2023-08-05T12:27:01Z)
- Domain Agnostic Image-to-image Translation using Low-Resolution Conditioning [6.470760375991825]
We propose a domain-agnostic i2i method for fine-grained problems, where the domains are related.
We present a novel approach that relies on training the generative model to produce images that share the distinctive information of the associated source image.
We validate our method on the CelebA-HQ and AFHQ datasets by demonstrating improvements in terms of visual quality.
arXiv Detail & Related papers (2023-05-08T19:58:49Z)
- Disentangled Unsupervised Image Translation via Restricted Information Flow [61.44666983942965]
Many state-of-the-art methods hard-code the desired shared-vs-specific split into their architecture.
We propose a new method that does not rely on inductive architectural biases.
We show that the proposed method achieves consistently high manipulation accuracy across two synthetic and one natural dataset.
arXiv Detail & Related papers (2021-11-26T00:27:54Z)
- Semantic Consistency in Image-to-Image Translation for Unsupervised Domain Adaptation [22.269565708490465]
Unsupervised Domain Adaptation (UDA) aims to adapt models trained on a source domain to a new target domain where no labelled data is available.
We propose a semantically consistent image-to-image translation method in combination with a consistency regularisation method for UDA.
arXiv Detail & Related papers (2021-11-05T14:22:20Z)
- SMILE: Semantically-guided Multi-attribute Image and Layout Editing [154.69452301122175]
Attribute image manipulation has been a very active topic since the introduction of Generative Adversarial Networks (GANs).
We present a multimodal representation that handles all attributes, be it guided by random noise or images, while only using the underlying domain information of the target domain.
Our method is capable of adding, removing or changing either fine-grained or coarse attributes by using an image as a reference or by exploring the style distribution space.
arXiv Detail & Related papers (2020-10-05T20:15:21Z)
- TriGAN: Image-to-Image Translation for Multi-Source Domain Adaptation [82.52514546441247]
We propose the first approach for Multi-Source Domain Adaptation (MSDA) based on Generative Adversarial Networks.
Our method is inspired by the observation that the appearance of a given image depends on three factors: the domain, the style and the content.
We test our approach using common MSDA benchmarks, showing that it outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-04-19T05:07:22Z)
- Cross-domain Correspondence Learning for Exemplar-based Image Translation [59.35767271091425]
We present a framework for exemplar-based image translation, which synthesizes a photo-realistic image from the input in a distinct domain.
The output has the style (e.g., color, texture) in consistency with the semantically corresponding objects in the exemplar.
We show that our method significantly outperforms state-of-the-art methods in terms of image quality.
arXiv Detail & Related papers (2020-04-12T09:10:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.