MCMI: Multi-Cycle Image Translation with Mutual Information Constraints
- URL: http://arxiv.org/abs/2007.02919v1
- Date: Mon, 6 Jul 2020 17:50:43 GMT
- Title: MCMI: Multi-Cycle Image Translation with Mutual Information Constraints
- Authors: Xiang Xu, Megha Nawhal, Greg Mori, Manolis Savva
- Abstract summary: We present a mutual information-based framework for unsupervised image-to-image translation.
Our MCMI approach treats single-cycle image translation models as modules that can be used recurrently in a multi-cycle translation setting.
We show that models trained with MCMI produce higher quality images and learn more semantically-relevant mappings.
- Score: 40.556049046897115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a mutual information-based framework for unsupervised
image-to-image translation. Our MCMI approach treats single-cycle image
translation models as modules that can be used recurrently in a multi-cycle
translation setting where the translation process is bounded by mutual
information constraints between the input and output images. The proposed
mutual information constraints can improve cross-domain mappings by optimizing
out translation functions that fail to satisfy the Markov property during image
translations. We show that models trained with MCMI produce higher quality
images and learn more semantically-relevant mappings compared to
state-of-the-art image translation methods. The MCMI framework can be applied
to existing unpaired image-to-image translation models with minimum
modifications. Qualitative experiments and a perceptual study demonstrate the
image quality improvements and generality of our approach using several
backbone models and a variety of image datasets.
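For intuition, the sketch below shows one way the multi-cycle idea with a mutual information constraint could be wired up. It is a minimal illustration, not the authors' implementation: the `TinyTranslator` and `MICritic` modules, the Donsker-Varadhan (MINE-style) mutual information lower bound, and the loss weights are all placeholder assumptions made for the example.

```python
# Minimal sketch (not the authors' code): two-cycle translation with an MI penalty.
# The translator/critic architectures, the Donsker-Varadhan MI lower bound, and the
# loss weights below are illustrative assumptions.
import torch
import torch.nn as nn

class TinyTranslator(nn.Module):
    """Placeholder single-cycle translator (stand-in for a CycleGAN-style generator)."""
    def __init__(self, ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, ch, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

class MICritic(nn.Module):
    """Critic T(x, y) used to form a Donsker-Varadhan lower bound on I(X; Y)."""
    def __init__(self, ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * ch, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1),
        )

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=1)).squeeze(1)

def mi_lower_bound(critic, x, y):
    """I(X;Y) >= E[T(x, y)] - log E[exp(T(x, y_shuffled))] (DV/MINE-style bound)."""
    joint = critic(x, y).mean()
    y_marginal = y[torch.randperm(y.size(0))]          # break pairing to sample the product of marginals
    scores = critic(x, y_marginal)
    marginal = torch.logsumexp(scores, dim=0) - torch.log(torch.tensor(float(scores.numel())))
    return joint - marginal

g_ab, g_ba = TinyTranslator(), TinyTranslator()        # single-cycle modules, reused across cycles
critic = MICritic()

x_a = torch.rand(4, 3, 64, 64)                         # batch of domain-A images
y_b = g_ab(x_a)                                        # cycle 1: A -> B
x_a_rec = g_ba(y_b)                                    # cycle 1: B -> A reconstruction
y_b2 = g_ab(x_a_rec)                                   # cycle 2: reuse the same modules recurrently

# Penalize low MI between the original input and the output of each cycle,
# alongside the usual cycle-consistency reconstruction term.
mi_term = mi_lower_bound(critic, x_a, y_b) + mi_lower_bound(critic, x_a, y_b2)
loss = (x_a - x_a_rec).abs().mean() - 0.1 * mi_term
loss.backward()
```

The DV estimator here is just one plausible stand-in for the MI terms; the abstract only states that the translation process is bounded by mutual information constraints between input and output images, so the specific estimator, cycle count, and weighting should be read as assumptions.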
Related papers
- MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling [64.09238330331195]
We propose a novel Multi-Modal Auto-Regressive (MMAR) probabilistic modeling framework.
Unlike discretization-based methods, MMAR takes in continuous-valued image tokens to avoid information loss.
We show that MMAR achieves superior performance compared to other joint multi-modal models.
arXiv Detail & Related papers (2024-10-14T17:57:18Z)
- Variational Bayesian Framework for Advanced Image Generation with Domain-Related Variables [29.827191184889898]
We present a unified Bayesian framework for advanced conditional generative problems.
We propose a variational Bayesian image translation network (VBITN) that enables multiple image translation and editing tasks.
arXiv Detail & Related papers (2023-05-23T09:47:23Z)
- Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation [72.6667341525552]
We present a new MMT approach based on a strong text-only MT model, which uses neural adapters and a novel guided self-attention mechanism.
We also introduce CoMMuTE, a Contrastive Multimodal Translation Evaluation set of ambiguous sentences and their possible translations.
Our approach obtains competitive results compared to strong text-only models on standard English-to-French, English-to-German and English-to-Czech benchmarks.
arXiv Detail & Related papers (2022-12-20T10:18:18Z)
- Vector Quantized Image-to-Image Translation [31.65282783830092]
We propose introducing the vector quantization technique into the image-to-image translation framework.
Our framework achieves comparable performance to state-of-the-art image-to-image translation and image extension methods.
arXiv Detail & Related papers (2022-07-27T04:22:29Z)
- Marginal Contrastive Correspondence for Guided Image Generation [58.0605433671196]
Exemplar-based image translation establishes dense correspondences between a conditional input and an exemplar from two different domains.
Existing work builds the cross-domain correspondences implicitly by minimizing feature-wise distances across the two domains.
We design a Marginal Contrastive Learning Network (MCL-Net) that uses contrastive learning to learn domain-invariant features for realistic exemplar-based image translation.
arXiv Detail & Related papers (2022-04-01T13:55:44Z)
- Unbalanced Feature Transport for Exemplar-based Image Translation [51.54421432912801]
This paper presents a general image translation framework that incorporates optimal transport for feature alignment between conditional inputs and style exemplars.
We show that our method achieves superior image translation both qualitatively and quantitatively compared with the state of the art.
arXiv Detail & Related papers (2021-06-19T12:07:48Z)
- Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation [56.55178339375146]
Image-to-Image (I2I) multi-domain translation models are usually also evaluated on the quality of their semantic results.
We propose a new training protocol based on three specific losses that help a translation network learn a smooth and disentangled latent style space.
arXiv Detail & Related papers (2021-06-16T17:58:21Z)
- Gumbel-Attention for Multi-modal Machine Translation [18.4381138617661]
Multi-modal machine translation (MMT) improves translation quality by introducing visual information.
Existing MMT models ignore the problem that images can carry information irrelevant to the text, introducing noise and degrading translation quality.
We propose a novel Gumbel-Attention mechanism for multi-modal machine translation, which selects the text-related parts of the image features.
arXiv Detail & Related papers (2021-03-16T05:44:01Z)
- Fusion Models for Improved Visual Captioning [18.016295296424413]
This paper proposes a generic multimodal model fusion framework for caption generation and emendation.
We employ the same fusion strategies to integrate a pretrained Masked Language Model (MLM) with a visual captioning model, viz. Show, Attend, and Tell.
Our caption emendation experiments on three benchmark image captioning datasets, viz. Flickr8k, Flickr30k, and MSCOCO, show improvements over the baseline.
arXiv Detail & Related papers (2020-10-28T21:55:25Z)