Multi-cropping Contrastive Learning and Domain Consistency for
Unsupervised Image-to-Image Translation
- URL: http://arxiv.org/abs/2304.12235v3
- Date: Wed, 5 Jul 2023 07:30:58 GMT
- Title: Multi-cropping Contrastive Learning and Domain Consistency for
Unsupervised Image-to-Image Translation
- Authors: Chen Zhao, Wei-Ling Cai, Zheng Yuan, Cheng-Wei Hu
- Abstract summary: We propose a novel unsupervised image-to-image translation framework based on multi-cropping contrastive learning and domain consistency, called MCDUT.
Our method achieves state-of-the-art results on many image-to-image translation tasks, and its advantages are demonstrated through comparison experiments and ablation studies.
- Score: 5.562419999563734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, unsupervised image-to-image translation methods based on
contrastive learning have achieved state-of-the-art results in many tasks.
However, in previous works the negatives are sampled from the input image
itself, which inspires us to design a data augmentation method to improve the
quality of the selected negatives. Moreover, previous methods preserve content
consistency only via patch-wise contrastive learning in the embedding space,
ignoring the domain consistency between the generated images and the real
the real images of the target domain. In this paper, we propose a novel
unsupervised image-to-image translation framework based on multi-cropping
contrastive learning and domain consistency, called MCDUT. Specifically, we
obtain multi-cropping views via center-cropping and random-cropping, with the
aim of generating higher-quality negative examples. To
constrain the embeddings in the deep feature space, we formulate a new domain
consistency loss, which encourages the generated images to be close to the real
images in the embedding space of the same domain. Furthermore, we present a
dual coordinate attention network, called DCA, which embeds positional
information into the channel dimension. We employ the DCA network in the design
of the generator, enabling it to capture global dependencies along the
horizontal and vertical directions. Our method achieves state-of-the-art
results on many image-to-image translation tasks, and its advantages are
demonstrated through extensive comparison experiments and ablation studies.
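
To make the multi-cropping idea concrete, here is a minimal PyTorch sketch of how center- and random-cropped views could be generated and fed into a PatchNCE-style contrastive loss (as in CUT, Park et al. 2020, which this line of work builds on). The crop sizes, temperature, and function names are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F
import torchvision.transforms as T

def multi_crop_views(img, out_size=256, num_random=4):
    """Build multi-cropping views: one center crop plus several random
    crops, resized to a common resolution. Crop sizes and counts here
    are illustrative assumptions, not the paper's settings."""
    center = T.Compose([T.CenterCrop(out_size // 2), T.Resize(out_size)])
    random_crop = T.Compose([T.RandomCrop(out_size // 2), T.Resize(out_size)])
    return [center(img)] + [random_crop(img) for _ in range(num_random)]

def patch_nce_loss(feat_q, feat_k, tau=0.07):
    """PatchNCE-style loss: each query patch embedding is pulled toward
    the patch at the same spatial location (diagonal positives) and
    pushed away from all other patches (negatives).
    feat_q, feat_k: (N, C) patch embeddings from generated/input images."""
    feat_q = F.normalize(feat_q, dim=1)
    feat_k = F.normalize(feat_k, dim=1)
    logits = feat_q @ feat_k.t() / tau                 # (N, N) similarities
    targets = torch.arange(feat_q.size(0), device=feat_q.device)
    return F.cross_entropy(logits, targets)
```

Under this reading, patch embeddings extracted from the cropped views would supply additional negative columns in the logits matrix, which is where the improved negative quality would enter the loss.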
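Likewise, the domain consistency idea can be sketched as pulling the encoder embedding of a translated image toward embeddings of real target-domain images. The encoder choice, the pairing of each fake with one real from the batch, and the L1 distance below are all assumptions; the paper's exact formulation may differ.

```python
import torch.nn.functional as F

def domain_consistency_loss(encoder, fake_b, real_b):
    """Hedged sketch of the domain consistency loss: push the embedding
    of a translated image fake_b toward embeddings of real images real_b
    from the target domain. `encoder` is any feature extractor returning
    (B, ...) features; distances are taken between normalized vectors."""
    z_fake = F.normalize(encoder(fake_b).flatten(1), dim=1)
    z_real = F.normalize(encoder(real_b).flatten(1), dim=1)
    return F.l1_loss(z_fake, z_real.detach())  # stop-grad on the real side
```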
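The DCA module is described as embedding positional information into the channel attention. A plausible starting point is coordinate attention (Hou et al., 2021), which pools along height and width separately so the attention maps retain position in both directions; the sketch below follows that design and should be read as an assumption about DCA's structure, not its exact wiring.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Coordinate-attention-style block: direction-aware pooling keeps
    positional information in the channel attention maps, giving the
    generator horizontal and vertical global context."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        mid = max(channels // reduction, 8)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        # Direction-aware pooling: (N, C, H, 1) and (N, C, W, 1)
        x_h = x.mean(dim=3, keepdim=True)                  # pool over width
        x_w = x.mean(dim=2, keepdim=True).transpose(2, 3)  # pool over height
        y = torch.cat([x_h, x_w], dim=2)                   # (N, C, H+W, 1)
        y = self.act(self.bn(self.conv1(y)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = self.conv_h(y_h).sigmoid()                   # (N, C, H, 1)
        a_w = self.conv_w(y_w.transpose(2, 3)).sigmoid()   # (N, C, 1, W)
        return x * a_h * a_w                               # broadcast gating
```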
Related papers
- SCONE-GAN: Semantic Contrastive learning-based Generative Adversarial
Network for an end-to-end image translation [18.93434486338439]
SCONE-GAN is shown to be effective for learning to generate realistic and diverse scenery images.
For more realistic and diverse image generation, we introduce a style reference image.
We validate the proposed algorithm for image-to-image translation and stylizing outdoor images.
arXiv Detail & Related papers (2023-11-07T10:29:16Z)
- Domain Agnostic Image-to-image Translation using Low-Resolution Conditioning [6.470760375991825]
We propose a domain-agnostic i2i method for fine-grained problems, where the domains are related.
We present a novel approach that relies on training the generative model to produce images that share distinctive information with the associated source image.
We validate our method on the CelebA-HQ and AFHQ datasets by demonstrating improvements in terms of visual quality.
arXiv Detail & Related papers (2023-05-08T19:58:49Z)
- ACE: Zero-Shot Image to Image Translation via Pretrained Auto-Contrastive-Encoder [2.1874189959020427]
We propose a new approach to extract image features by learning the similarities and differences of samples within the same data distribution.
The design of ACE enables us to achieve zero-shot image-to-image translation with no training on image translation tasks for the first time.
Our model achieves competitive results on multimodal image translation tasks with zero-shot learning as well.
arXiv Detail & Related papers (2023-02-22T23:52:23Z)
- Unsupervised Image-to-Image Translation with Generative Prior [103.54337984566877]
Unsupervised image-to-image translation aims to learn the translation between two visual domains without paired data.
We present a novel framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), to improve the overall quality and applicability of the translation algorithm.
arXiv Detail & Related papers (2022-04-07T17:59:23Z)
- Unsupervised Domain Adaptation with Contrastive Learning for OCT Segmentation [49.59567529191423]
We propose a novel semi-supervised learning framework for segmentation of volumetric images from new unlabeled domains.
We jointly use supervised and contrastive learning, also introducing a contrastive pairing scheme that leverages similarity between nearby slices in 3D.
arXiv Detail & Related papers (2022-03-07T19:02:26Z)
- Object-aware Contrastive Learning for Debiased Scene Representation [74.30741492814327]
We develop a novel object-aware contrastive learning framework that localizes objects in a self-supervised manner.
We also introduce two data augmentations based on ContraCAM, object-aware random crop and background mixup, which reduce contextual and background biases during contrastive self-supervised learning.
arXiv Detail & Related papers (2021-07-30T19:24:07Z)
- Learning Unsupervised Cross-domain Image-to-Image Translation Using a Shared Discriminator [2.1377923666134118]
Unsupervised image-to-image translation is used to transform images from a source domain to generate images in a target domain without using source-target image pairs.
We propose a new method that uses a single shared discriminator between the two GANs, which improves the overall efficacy.
Our results indicate that even without adding attention mechanisms, our method performs at par with attention-based methods and generates images of comparable quality.
arXiv Detail & Related papers (2021-02-09T08:26:23Z)
- Image-to-image Mapping with Many Domains by Sparse Attribute Transfer [71.28847881318013]
Unsupervised image-to-image translation consists of learning a pair of mappings between two domains without known pairwise correspondences between points.
The current convention is to approach this task with cycle-consistent GANs.
We propose an alternate approach that directly restricts the generator to performing a simple sparse transformation in a latent layer.
arXiv Detail & Related papers (2020-06-23T19:52:23Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global content consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
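
As a rough illustration of the "dense combinations of dilated convolutions" mentioned in the last entry, the sketch below runs several dilation rates in parallel and fuses them, enlarging the receptive field without downsampling. The rates and the residual 1x1 fusion are assumptions, not the paper's exact block.

```python
import torch
import torch.nn as nn

class DenseDilatedBlock(nn.Module):
    """Parallel dilated 3x3 convolutions at increasing rates, fused by a
    1x1 convolution with a residual connection. Each rate sees a wider
    context, so the block's effective receptive field grows cheaply."""
    def __init__(self, channels, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
            for r in rates
        )
        self.fuse = nn.Conv2d(channels * len(rates), channels, 1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        feats = [self.act(branch(x)) for branch in self.branches]
        return x + self.fuse(torch.cat(feats, dim=1))  # residual fusion
```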