GMM-UNIT: Unsupervised Multi-Domain and Multi-Modal Image-to-Image
Translation via Attribute Gaussian Mixture Modeling
- URL: http://arxiv.org/abs/2003.06788v2
- Date: Sat, 21 Mar 2020 22:15:26 GMT
- Title: GMM-UNIT: Unsupervised Multi-Domain and Multi-Modal Image-to-Image
Translation via Attribute Gaussian Mixture Modeling
- Authors: Yahui Liu, Marco De Nadai, Jian Yao, Nicu Sebe, Bruno Lepri, Xavier
Alameda-Pineda
- Abstract summary: Unsupervised image-to-image translation (UNIT) aims at learning a mapping between several visual domains by using unpaired training images.
Recent studies have shown remarkable success for multiple domains but they suffer from two main limitations.
We propose a method named GMM-UNIT, which is based on a content-attribute disentangled representation where the attribute space is fitted with a GMM.
- Score: 66.50914391679375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised image-to-image translation (UNIT) aims at learning a mapping
between several visual domains by using unpaired training images. Recent
studies have shown remarkable success for multiple domains but they suffer from
two main limitations: they are either built from several two-domain mappings
that are required to be learned independently, or they generate low-diversity
results, a problem known as mode collapse. To overcome these limitations, we
propose a method named GMM-UNIT, which is based on a content-attribute
disentangled representation where the attribute space is fitted with a GMM.
Each GMM component represents a domain, and this simple assumption has two
prominent advantages. First, it can be easily extended to most multi-domain and
multi-modal image-to-image translation tasks. Second, the continuous domain
encoding allows for interpolation between domains and for extrapolation to
unseen domains and translations. Additionally, we show how GMM-UNIT can be
constrained down to different methods in the literature, meaning that GMM-UNIT
is a unifying framework for unsupervised image-to-image translation.
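As a rough illustration of the idea stated in the abstract (one GMM component per domain in the attribute space, with interpolation between domains), a minimal NumPy sketch might look like the following. This is a toy example under stated assumptions, not the authors' implementation: the helper names (`make_domain_gmm`, `sample_attribute`, `interpolate_domains`), the isotropic covariances, and the placement of the component means are all hypothetical, and the content encoder and image decoder are omitted entirely.

```python
import numpy as np


def make_domain_gmm(num_domains, attr_dim, spread=4.0, sigma=1.0):
    """Build a toy GMM with one isotropic Gaussian component per domain.

    Assumption: means are placed apart along separate axes and all
    components share the covariance sigma^2 * I. These are illustrative
    choices only, not values taken from the paper.
    """
    means = np.zeros((num_domains, attr_dim))
    for k in range(num_domains):
        means[k, k % attr_dim] = spread * (1 + k // attr_dim)
    covs = np.stack([sigma**2 * np.eye(attr_dim) for _ in range(num_domains)])
    return means, covs


def sample_attribute(means, covs, domain, rng, n=1):
    """Draw n attribute codes from the component assigned to `domain`."""
    return rng.multivariate_normal(means[domain], covs[domain], size=n)


def interpolate_domains(means, src, dst, alpha):
    """Linearly interpolate between two component means (0 <= alpha <= 1)."""
    return (1.0 - alpha) * means[src] + alpha * means[dst]


rng = np.random.default_rng(0)
means, covs = make_domain_gmm(num_domains=3, attr_dim=8)

# Several attribute codes for the same target domain: in a full model,
# decoding each with the same content code would give diverse outputs.
z = sample_attribute(means, covs, domain=1, rng=rng, n=5)

# An attribute code halfway between domain 0 and domain 2.
z_mid = interpolate_domains(means, src=0, dst=2, alpha=0.5)
```

In a full model, each sampled attribute code would be combined with a content code and passed to a decoder. Because the domain encoding is continuous, a code such as `z_mid` that lies between component means remains a valid decoder input.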
Related papers
- Domain-Scalable Unpaired Image Translation via Latent Space Anchoring [88.7642967393508]
Unpaired image-to-image translation (UNIT) aims to map images between two visual domains without paired training data.
We propose a new domain-scalable UNIT method, termed latent space anchoring.
Our method anchors images of different domains to the same latent space of frozen GANs by learning lightweight encoder and regressor models.
In the inference phase, the learned encoders and decoders of different domains can be arbitrarily combined to translate images between any two domains without fine-tuning.
arXiv Detail & Related papers (2023-06-26T17:50:02Z)
- I2F: A Unified Image-to-Feature Approach for Domain Adaptive Semantic Segmentation [55.633859439375044]
Unsupervised domain adaptation (UDA) for semantic segmentation is a promising task freeing people from heavy annotation work.
The key idea for tackling this problem is to perform image-level and feature-level adaptation jointly.
This paper proposes a novel UDA pipeline for semantic segmentation that unifies image-level and feature-level adaptation.
arXiv Detail & Related papers (2023-01-03T15:19:48Z)
- Unsupervised Domain Adaptation for Semantic Segmentation using One-shot Image-to-Image Translation via Latent Representation Mixing [9.118706387430883]
We propose a new unsupervised domain adaptation method for the semantic segmentation of very high resolution images.
An image-to-image translation paradigm is proposed, based on an encoder-decoder principle where latent content representations are mixed across domains.
Cross-city comparative experiments have shown that the proposed method outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2022-12-07T18:16:17Z)
- Domain Invariant Masked Autoencoders for Self-supervised Learning from Multi-domains [73.54897096088149]
We propose a Domain-invariant Masked AutoEncoder (DiMAE) for self-supervised learning from multi-domains.
The core idea is to augment the input image with style noise from different domains and then reconstruct the image from the embedding of the augmented image.
Experiments on PACS and DomainNet illustrate that DiMAE achieves considerable gains compared with recent state-of-the-art methods.
arXiv Detail & Related papers (2022-05-10T09:49:40Z)
- Multi-domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution [62.4972011636884]
We propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework.
We exploit the decomposed content feature and appearance adaptive convolution to translate an image into a target appearance.
We show that the proposed method produces visually diverse and plausible results in multiple domains compared to the state-of-the-art methods.
arXiv Detail & Related papers (2022-02-06T14:12:34Z)
- Disentangled Unsupervised Image Translation via Restricted Information Flow [61.44666983942965]
Many state-of-the-art methods hard-code the desired shared-vs-specific split into their architecture.
We propose a new method that does not rely on inductive architectural biases.
We show that the proposed method achieves consistently high manipulation accuracy across two synthetic and one natural dataset.
arXiv Detail & Related papers (2021-11-26T00:27:54Z)
- Crossing-Domain Generative Adversarial Networks for Unsupervised Multi-Domain Image-to-Image Translation [12.692904507625036]
We propose a general framework for unsupervised image-to-image translation across multiple domains.
Our proposed framework consists of a pair of encoders along with a pair of GANs, which learn high-level features across different domains in order to generate diverse and realistic samples.
arXiv Detail & Related papers (2020-08-27T01:54:07Z)
- SoloGAN: Multi-domain Multimodal Unpaired Image-to-Image Translation via a Single Generative Adversarial Network [4.7344504314446345]
We present a flexible and general SoloGAN model for efficient multimodal I2I translation among multiple domains with unpaired data.
In contrast to existing methods, the SoloGAN algorithm uses a single projection discriminator with an additional auxiliary classifier and shares the encoder and generator for all domains.
arXiv Detail & Related papers (2020-08-04T16:31:15Z)