Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings
- URL: http://arxiv.org/abs/2002.06661v1
- Date: Sun, 16 Feb 2020 19:49:30 GMT
- Title: Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings
- Authors: Shweta Mahajan, Iryna Gurevych, Stefan Roth
- Abstract summary: Learned joint representations of images and text form the backbone of several important cross-domain tasks such as image captioning.
We propose a novel semi-supervised framework, which models shared information between domains and domain-specific information separately.
We demonstrate the effectiveness of our model on diverse tasks, including image captioning and text-to-image synthesis.
- Score: 76.85673049332428
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learned joint representations of images and text form the backbone of several
important cross-domain tasks such as image captioning. Prior work mostly maps
both domains into a common latent representation in a purely supervised
fashion. This is rather restrictive, however, as the two domains follow
distinct generative processes. Therefore, we propose a novel semi-supervised
framework, which models shared information between domains and domain-specific
information separately. The information shared between the domains is aligned
with an invertible neural network. Our model integrates normalizing flow-based
priors for the domain-specific information, which allows us to learn diverse
many-to-many mappings between the two domains. We demonstrate the effectiveness
of our model on diverse tasks, including image captioning and text-to-image
synthesis.
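The abstract names two concrete ingredients: an invertible neural network that aligns the information shared between the image and text domains, and normalizing-flow priors over the domain-specific latents. As a rough illustration (not the authors' released code), the following PyTorch sketch builds both from RealNVP-style affine coupling layers; all module names, latent sizes, and the toy usage at the end are assumptions.

```python
# Hedged sketch (PyTorch assumed): RealNVP-style affine coupling layers used
# (a) as a normalizing-flow prior over a domain-specific latent and (b) as an
# invertible map that aligns the shared latents of the two domains. Latent
# sizes and layer counts are illustrative, not the authors' configuration.
import torch
import torch.nn as nn


class AffineCoupling(nn.Module):
    """Invertible coupling layer with a tractable log-determinant."""

    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, z):
        z1, z2 = z[:, :self.half], z[:, self.half:]
        log_s, t = self.net(z1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)  # keep scales numerically stable
        y2 = z2 * log_s.exp() + t
        return torch.cat([z1, y2], dim=1), log_s.sum(dim=1)

    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        log_s, t = self.net(y1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)
        z2 = (y2 - t) * (-log_s).exp()
        return torch.cat([y1, z2], dim=1)


class FlowPrior(nn.Module):
    """Normalizing-flow prior p(z) over one domain-specific latent."""

    def __init__(self, dim: int, n_layers: int = 4):
        super().__init__()
        self.dim = dim
        self.layers = nn.ModuleList([AffineCoupling(dim) for _ in range(n_layers)])
        self.base = torch.distributions.Normal(0.0, 1.0)

    def log_prob(self, z):
        # Change of variables: log p(z) = log N(f(z); 0, I) + log|det df/dz|.
        log_det = z.new_zeros(z.shape[0])
        for layer in self.layers:
            z, ld = layer(z)
            log_det = log_det + ld
        return self.base.log_prob(z).sum(dim=1) + log_det

    def sample(self, n: int):
        z = torch.randn(n, self.dim)
        for layer in reversed(self.layers):
            z = layer.inverse(z)
        return z


# Usage sketch: flow priors over domain-specific latents support diverse,
# many-to-many outputs; an invertible coupling layer ties the shared parts
# of the image and text latents together (all sizes below are assumptions).
prior_txt = FlowPrior(dim=32)                     # prior over text-specific latent
z_txt_specific = torch.randn(8, 32)               # stand-in for an encoder output
nll = -prior_txt.log_prob(z_txt_specific).mean()  # likelihood term of the objective

align = AffineCoupling(dim=64)                    # invertible shared-latent aligner
z_img_shared = torch.randn(8, 64)
z_txt_shared, _ = align(z_img_shared)             # image -> text direction
z_img_recovered = align.inverse(z_txt_shared)     # exact inverse, text -> image
```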
Related papers
- Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval [55.122020263319634]
Video moment retrieval (VMR) aims to localize the target moment from an untrimmed video according to a given language query.
In this paper, we focus on a novel task: cross-domain VMR, where fully-annotated datasets are available in one domain but the domain of interest only contains unannotated datasets.
We propose a novel Multi-Modal Cross-Domain Alignment network to transfer the annotation knowledge from the source domain to the target domain.
arXiv Detail & Related papers (2022-09-23T12:58:20Z)
- Dual-Domain Image Synthesis using Segmentation-Guided GAN [33.00724627120716]
We introduce a segmentation-guided approach to synthesise images that integrate features from two distinct domains.
Images synthesised by our dual-domain model belong to one domain within the semantic mask, and to another in the rest of the image.
arXiv Detail & Related papers (2022-04-19T17:25:54Z)
- Unsupervised Domain Generalization by Learning a Bridge Across Domains [78.855606355957]
The Unsupervised Domain Generalization (UDG) setup provides no training supervision in either the source or the target domains.
Our approach is based on self-supervised learning of a Bridge Across Domains (BrAD), an auxiliary bridge domain accompanied by a set of semantics-preserving visual (image-to-image) mappings from each training domain to BrAD.
We show that, using an edge-regularized BrAD, our approach achieves significant gains across multiple benchmarks and a range of tasks, including UDG, Few-shot UDA, and unsupervised generalization across multi-domain datasets.
arXiv Detail & Related papers (2021-12-04T10:25:45Z)
- Disentangled Unsupervised Image Translation via Restricted Information Flow [61.44666983942965]
Many state-of-the-art methods hard-code the desired shared-vs-specific split into their architecture.
We propose a new method that does not rely on inductive architectural biases.
We show that the proposed method achieves consistently high manipulation accuracy across two synthetic and one natural dataset.
arXiv Detail & Related papers (2021-11-26T00:27:54Z)
- Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains into a common latent space.
Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z)
- Variational Interaction Information Maximization for Cross-domain Disentanglement [34.08140408283391]
Cross-domain disentanglement is the problem of learning representations partitioned into domain-invariant and domain-specific representations.
We cast the simultaneous learning of domain-invariant and domain-specific representations as a joint objective of multiple information constraints.
We show that our model achieves state-of-the-art performance on the zero-shot sketch-based image retrieval task.
arXiv Detail & Related papers (2020-12-08T07:11:35Z)
- Unsupervised Wasserstein Distance Guided Domain Adaptation for 3D Multi-Domain Liver Segmentation [14.639633860575621]
Unsupervised domain adaptation aims to improve network performance when applying robust models trained on medical images from source domains to a new target domain.
We present an approach based on the Wasserstein distance guided disentangled representation to achieve 3D multi-domain liver segmentation.
arXiv Detail & Related papers (2020-09-06T23:48:27Z)
- Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation [56.94873619509414]
Conventional unsupervised domain adaptation studies the knowledge transfer between a limited number of domains.
We propose a novel Domain2Vec model to provide vectorial representations of visual domains based on joint learning of feature disentanglement and Gram matrix.
We demonstrate that our embedding is capable of predicting domain similarities that match our intuition about visual relations between different domains.
arXiv Detail & Related papers (2020-07-17T22:05:09Z)
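The Domain2Vec entry above mentions Gram matrices of deep features as one ingredient of its domain embedding. As a loose, hedged illustration of that idea only (not the paper's model), the sketch below averages per-image Gram matrices of CNN feature maps over a batch drawn from one domain and compares two such descriptors; the feature shapes and the cosine comparison are assumptions.

```python
# Hedged sketch: a Gram-matrix domain descriptor, loosely inspired by the
# Domain2Vec summary above. The backbone is left abstract; only (B, C, H, W)
# feature maps are assumed. This is illustrative, not the paper's method.
import torch
import torch.nn.functional as F


def gram_domain_embedding(features: torch.Tensor) -> torch.Tensor:
    """features: (B, C, H, W) CNN feature maps for images from one visual domain.
    Returns a single flattened descriptor vector for that domain."""
    b, c, h, w = features.shape
    f = features.reshape(b, c, h * w)
    gram = f @ f.transpose(1, 2) / (h * w)   # (B, C, C) per-image Gram matrix
    return gram.mean(dim=0).flatten()        # average over the batch, then flatten


# Toy usage: random feature maps standing in for two different visual domains.
dom_a = gram_domain_embedding(torch.randn(16, 64, 14, 14))
dom_b = gram_domain_embedding(torch.randn(16, 64, 14, 14))
similarity = F.cosine_similarity(dom_a, dom_b, dim=0)  # scalar domain similarity
```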
This list is automatically generated from the titles and abstracts of the papers on this site.