SoloGAN: Multi-domain Multimodal Unpaired Image-to-Image Translation via a Single Generative Adversarial Network
- URL: http://arxiv.org/abs/2008.01681v3
- Date: Tue, 28 Jun 2022 18:35:53 GMT
- Title: SoloGAN: Multi-domain Multimodal Unpaired Image-to-Image Translation via a Single Generative Adversarial Network
- Authors: Shihua Huang, Cheng He, Ran Cheng
- Abstract summary: We present a flexible and general SoloGAN model for efficient multimodal I2I translation among multiple domains with unpaired data.
In contrast to existing methods, the SoloGAN algorithm uses a single projection discriminator with an additional auxiliary classifier and shares the encoder and generator for all domains.
- Score: 4.7344504314446345
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite significant advances in image-to-image (I2I) translation with
generative adversarial networks (GANs), it remains challenging to effectively
translate an image to a set of diverse images in multiple target domains using
a single pair of generator and discriminator. Existing I2I translation methods
adopt multiple domain-specific content encoders for different domains, where
each domain-specific content encoder is trained with images from the same
domain only. Nevertheless, we argue that the content (domain-invariant)
features should be learned from images across all of the domains. Consequently,
each domain-specific content encoder of existing schemes fails to extract the
domain-invariant features efficiently. To address this issue, we present a
flexible and general SoloGAN model for efficient multimodal I2I translation
among multiple domains with unpaired data. In contrast to existing methods, the
SoloGAN algorithm uses a single projection discriminator with an additional
auxiliary classifier and shares the encoder and generator for all domains.
Consequently, SoloGAN can be trained effectively with images from all domains,
such that the domain-invariant content representation can be extracted
efficiently. Qualitative and quantitative results on a wide range of datasets,
against several counterparts and variants of SoloGAN, demonstrate the merits of
the method, especially on challenging I2I translation datasets, i.e., datasets
involving extreme shape variations or requiring complex backgrounds to be kept
unchanged after translation. Furthermore, we demonstrate the contribution of
each component of SoloGAN through ablation studies.
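To make the architecture concrete, the sketch below shows, in PyTorch, what a single projection discriminator with an auxiliary classifier head can look like. It is a minimal, assumption-laden illustration (backbone, layer sizes, and names are placeholders), not the authors' released implementation; the same shared-backbone principle applies analogously to the encoder and generator.

```python
# Illustrative sketch only: a single discriminator serving all domains.
# The adversarial score is domain-conditioned via a projection (inner product
# with a learned domain embedding), and an auxiliary classifier head predicts
# the domain label. All layer sizes here are assumptions, not the paper's.
import torch
import torch.nn as nn

class ProjectionDiscriminator(nn.Module):
    def __init__(self, num_domains: int, feat_dim: int = 256):
        super().__init__()
        # Shared convolutional backbone phi(x), trained on images from all domains.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, feat_dim, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.linear = nn.Linear(feat_dim, 1)                 # unconditional score psi(phi(x))
        self.embed = nn.Embedding(num_domains, feat_dim)     # domain embedding for the projection term
        self.classifier = nn.Linear(feat_dim, num_domains)   # auxiliary domain classifier head

    def forward(self, x: torch.Tensor, domain: torch.Tensor):
        h = self.pool(self.backbone(x)).flatten(1)           # phi(x), shape (B, feat_dim)
        # Projection discriminator: psi(phi(x)) + <embed(domain), phi(x)>
        adv = self.linear(h) + (self.embed(domain) * h).sum(dim=1, keepdim=True)
        cls_logits = self.classifier(h)                      # domain logits for the auxiliary loss
        return adv, cls_logits

# Example usage: one discriminator scores images from any of, say, 3 domains.
D = ProjectionDiscriminator(num_domains=3)
adv, logits = D(torch.randn(4, 3, 64, 64), torch.tensor([0, 1, 2, 0]))
```

Because every domain shares the backbone, the discriminator sees images from all domains during training, which is exactly the property the abstract argues is needed to extract domain-invariant content features.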
Related papers
- I2I-Galip: Unsupervised Medical Image Translation Using Generative Adversarial CLIP [30.506544165999564]
Unpaired image-to-image translation is a challenging task due to the absence of paired examples.
We propose a new image-to-image translation framework named Image-to-Image-Generative-Adversarial-CLIP (I2I-Galip).
arXiv Detail & Related papers (2024-09-19T01:44:50Z)
- WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization [63.98650220772378]
We present WIDIn, Wording Images for Domain-Invariant representation, to disentangle discriminative visual representation.
We first estimate the language embedding with fine-grained alignment, which can be used to adaptively identify and then remove the domain-specific counterpart.
We show that WIDIn can be applied to both pretrained vision-language models like CLIP, and separately trained uni-modal models like MoCo and BERT.
arXiv Detail & Related papers (2024-05-28T17:46:27Z)
- Domain-Scalable Unpaired Image Translation via Latent Space Anchoring [88.7642967393508]
Unpaired image-to-image translation (UNIT) aims to map images between two visual domains without paired training data.
We propose a new domain-scalable UNIT method, termed latent space anchoring.
Our method anchors images of different domains to the same latent space of frozen GANs by learning lightweight encoder and regressor models.
In the inference phase, the learned encoders and decoders of different domains can be arbitrarily combined to translate images between any two domains without fine-tuning.
arXiv Detail & Related papers (2023-06-26T17:50:02Z)
- Multi-Scale Multi-Target Domain Adaptation for Angle Closure Classification [50.658613573816254]
We propose a novel Multi-scale Multi-target Domain Adversarial Network (M2DAN) for angle closure classification.
Based on these domain-invariant features at different scales, the deep model trained on the source domain is able to classify angle closure on multiple target domains.
arXiv Detail & Related papers (2022-08-25T15:27:55Z)
- Multi-domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution [62.4972011636884]
We propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework.
We exploit the decomposed content feature and appearance adaptive convolution to translate an image into a target appearance.
We show that the proposed method produces visually diverse and plausible results in multiple domains compared to the state-of-the-art methods.
arXiv Detail & Related papers (2022-02-06T14:12:34Z)
- Disentangled Unsupervised Image Translation via Restricted Information Flow [61.44666983942965]
Many state-of-the-art methods hard-code the desired shared-vs-specific split into their architecture.
We propose a new method that does not rely on inductive architectural biases.
We show that the proposed method achieves consistently high manipulation accuracy across two synthetic and one natural dataset.
arXiv Detail & Related papers (2021-11-26T00:27:54Z)
- Crossing-Domain Generative Adversarial Networks for Unsupervised Multi-Domain Image-to-Image Translation [12.692904507625036]
We propose a general framework for unsupervised image-to-image translation across multiple domains.
Our proposed framework consists of a pair of encoders along with a pair of GANs that learn high-level features across different domains to generate diverse and realistic samples.
arXiv Detail & Related papers (2020-08-27T01:54:07Z)
- Multi-Domain Image Completion for Random Missing Input Data [17.53581223279953]
Multi-domain data are widely leveraged in vision applications taking advantage of complementary information from different modalities.
Due to possible data corruption and different imaging protocols, the availability of images for each domain could vary amongst multiple data sources.
We propose a general approach to complete randomly missing domain data in real applications.
arXiv Detail & Related papers (2020-07-10T16:38:48Z)
- GMM-UNIT: Unsupervised Multi-Domain and Multi-Modal Image-to-Image Translation via Attribute Gaussian Mixture Modeling [66.50914391679375]
Unsupervised image-to-image translation (UNIT) aims at learning a mapping between several visual domains by using unpaired training images.
Recent studies have shown remarkable success for multiple domains but they suffer from two main limitations.
We propose a method named GMM-UNIT, which is based on a content-attribute disentangled representation where the attribute space is fitted with a GMM (a toy sampling sketch follows below).
arXiv Detail & Related papers (2020-03-15T10:18:56Z)
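As a toy illustration of the attribute-GMM idea above (an assumption on my part, not the GMM-UNIT code): multimodal outputs arise because the attribute code is sampled from the Gaussian mixture component associated with the target domain, so repeated draws yield diverse translations.

```python
# Toy sketch: sample attribute codes from a per-domain Gaussian component.
# In GMM-UNIT the component parameters are learned; random values stand in here.
import torch

num_domains, attr_dim = 3, 8
mu = torch.randn(num_domains, attr_dim)         # per-domain component means (learned in practice)
log_sigma = torch.zeros(num_domains, attr_dim)  # per-domain log-scales (learned in practice)

def sample_attribute(domain: int, n: int = 1) -> torch.Tensor:
    """Draw n attribute codes from the Gaussian component of `domain`."""
    eps = torch.randn(n, attr_dim)
    return mu[domain] + log_sigma[domain].exp() * eps

z = sample_attribute(domain=1, n=4)  # four diverse style codes for domain 1
```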