Domain-Scalable Unpaired Image Translation via Latent Space Anchoring
- URL: http://arxiv.org/abs/2306.14879v1
- Date: Mon, 26 Jun 2023 17:50:02 GMT
- Title: Domain-Scalable Unpaired Image Translation via Latent Space Anchoring
- Authors: Siyu Huang, Jie An, Donglai Wei, Zudi Lin, Jiebo Luo, Hanspeter
Pfister
- Abstract summary: Unpaired image-to-image translation (UNIT) aims to map images between two visual domains without paired training data.
We propose a new domain-scalable UNIT method, termed latent space anchoring.
Our method anchors images of different domains to the same latent space of frozen GANs by learning lightweight encoder and regressor models.
In the inference phase, the learned encoders and decoders of different domains can be arbitrarily combined to translate images between any two domains without fine-tuning.
- Score: 88.7642967393508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unpaired image-to-image translation (UNIT) aims to map images between two
visual domains without paired training data. However, given a UNIT model
trained on certain domains, it is difficult for current methods to incorporate
new domains because they often need to train the full model on both existing
and new domains. To address this problem, we propose a new domain-scalable UNIT
method, termed latent space anchoring, which can be efficiently extended to
new visual domains and does not need to fine-tune the encoders and decoders of
existing domains. Our method anchors images of different domains to the same
latent space of frozen GANs by learning lightweight encoder and regressor
models to reconstruct single-domain images. In the inference phase, the learned
encoders and decoders of different domains can be arbitrarily combined to
translate images between any two domains without fine-tuning. Experiments on
various datasets show that the proposed method achieves superior performance on
both standard and domain-scalable UNIT tasks in comparison with the
state-of-the-art methods.
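The mix-and-match inference described in the abstract can be made concrete with a short sketch. The PyTorch code below is a minimal illustration, not the authors' released implementation: the module names (FrozenGenerator, DomainEncoder, DomainRegressor) and all architectures and dimensions are illustrative assumptions. The structure follows the abstract, where lightweight per-domain encoders map images into the latent space of a frozen GAN and per-domain regressors render generator features back into images, so any encoder/regressor pair can be combined at inference time.

```python
# A minimal sketch of latent space anchoring at inference time.
# All names, architectures, and dimensions here are illustrative
# assumptions; see the paper for the actual models.
import torch
import torch.nn as nn


class FrozenGenerator(nn.Module):
    """Stand-in for a pretrained GAN generator; its weights are never updated."""

    def __init__(self, latent_dim=512, feat_channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, feat_channels * 8 * 8),
            nn.Unflatten(1, (feat_channels, 8, 8)),
            nn.Upsample(scale_factor=4),
            nn.Conv2d(feat_channels, feat_channels, 3, padding=1),
        )
        for p in self.parameters():  # the frozen anchor: no gradients flow here
            p.requires_grad_(False)

    def forward(self, z):
        return self.net(z)


class DomainEncoder(nn.Module):
    """Lightweight per-domain encoder: image -> shared GAN latent code."""

    def __init__(self, in_channels=3, latent_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, latent_dim),
        )

    def forward(self, x):
        return self.net(x)


class DomainRegressor(nn.Module):
    """Lightweight per-domain regressor: generator features -> domain image."""

    def __init__(self, feat_channels=64, out_channels=3):
        super().__init__()
        self.net = nn.Conv2d(feat_channels, out_channels, 3, padding=1)

    def forward(self, feats):
        return self.net(feats)


@torch.no_grad()
def translate(x, enc_src, generator, reg_tgt):
    """Map a source-domain image to the target domain through the shared
    latent space; no fine-tuning of any module is required."""
    z = enc_src(x)         # source encoder anchors the image in latent space
    feats = generator(z)   # frozen generator expands the latent code
    return reg_tgt(feats)  # target regressor renders the output image


# Any encoder can be paired with any regressor at inference time.
generator = FrozenGenerator()
enc_a, reg_b = DomainEncoder(), DomainRegressor()
out = translate(torch.randn(1, 3, 32, 32), enc_a, generator, reg_b)
print(out.shape)  # torch.Size([1, 3, 32, 32])
```

Because only the lightweight encoder and regressor of a new domain need training, the frozen generator and all previously learned modules stay untouched, which is what makes the method domain-scalable.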
Related papers
- Reducing Domain Gap in Frequency and Spatial domain for Cross-modality
Domain Adaptation on Medical Image Segmentation [5.371816551086118]
Unsupervised domain adaptation (UDA) aims to learn a model trained on a source domain that performs well on an unlabeled target domain.
We propose a simple yet effective UDA method based on frequency and spatial domain transfer under a multi-teacher distillation framework.
Our proposed method achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2022-11-28T11:35:39Z) - Using Language to Extend to Unseen Domains [81.37175826824625]
It is expensive to collect training data for every possible domain that a vision model may encounter when deployed.
We consider how simply verbalizing the training domain, as well as the domains we want to extend to but lack data for, can improve robustness.
Using a multimodal model with a joint image and language embedding space, our method LADS learns a transformation of the image embeddings from the training domain to each unseen test domain.
arXiv Detail & Related papers (2022-10-18T01:14:02Z) - Multi-Scale Multi-Target Domain Adaptation for Angle Closure
Classification [50.658613573816254]
We propose a novel Multi-scale Multi-target Domain Adversarial Network (M2DAN) for angle closure classification.
Based on these domain-invariant features at different scales, the deep model trained on the source domain is able to classify angle closure on multiple target domains.
arXiv Detail & Related papers (2022-08-25T15:27:55Z) - Domain Invariant Masked Autoencoders for Self-supervised Learning from
Multi-domains [73.54897096088149]
We propose a Domain-invariant Masked AutoEncoder (DiMAE) for self-supervised learning from multi-domains.
The core idea is to augment the input image with style noise from different domains and then reconstruct the image from the embedding of the augmented image.
Experiments on PACS and DomainNet illustrate that DiMAE achieves considerable gains compared with recent state-of-the-art methods.
arXiv Detail & Related papers (2022-05-10T09:49:40Z) - Dual-Domain Image Synthesis using Segmentation-Guided GAN [33.00724627120716]
We introduce a segmentation-guided approach to synthesise images that integrate features from two distinct domains.
Images synthesised by our dual-domain model belong to one domain within the semantic mask, and to another in the rest of the image.
arXiv Detail & Related papers (2022-04-19T17:25:54Z) - Semantic Consistency in Image-to-Image Translation for Unsupervised
Domain Adaptation [22.269565708490465]
Unsupervised Domain Adaptation (UDA) aims to adapt models trained on a source domain to a new target domain where no labelled data is available.
We propose a semantically consistent image-to-image translation method in combination with a consistency regularisation method for UDA.
arXiv Detail & Related papers (2021-11-05T14:22:20Z) - Crossing-Domain Generative Adversarial Networks for Unsupervised
Multi-Domain Image-to-Image Translation [12.692904507625036]
We propose a general framework for unsupervised image-to-image translation across multiple domains.
Our proposed framework consists of a pair of encoders along with a pair of GANs, which learn high-level features across different domains to generate diverse and realistic samples.
arXiv Detail & Related papers (2020-08-27T01:54:07Z) - GMM-UNIT: Unsupervised Multi-Domain and Multi-Modal Image-to-Image
Translation via Attribute Gaussian Mixture Modeling [66.50914391679375]
Unsupervised image-to-image translation (UNIT) aims at learning a mapping between several visual domains by using unpaired training images.
Recent studies have shown remarkable success for multiple domains, but they suffer from two main limitations.
We propose a method named GMM-UNIT, which is based on a content-attribute disentangled representation where the space is fitted with a GMM.
arXiv Detail & Related papers (2020-03-15T10:18:56Z) - Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings [76.85673049332428]
Learned joint representations of images and text form the backbone of several important cross-domain tasks such as image captioning.
We propose a novel semi-supervised framework, which models shared information between domains and domain-specific information separately.
We demonstrate the effectiveness of our model on diverse tasks, including image captioning and text-to-image synthesis.
arXiv Detail & Related papers (2020-02-16T19:49:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.