Global and Local Alignment Networks for Unpaired Image-to-Image
Translation
- URL: http://arxiv.org/abs/2111.10346v1
- Date: Fri, 19 Nov 2021 18:01:54 GMT
- Title: Global and Local Alignment Networks for Unpaired Image-to-Image
Translation
- Authors: Guanglei Yang, Hao Tang, Humphrey Shi, Mingli Ding, Nicu Sebe, Radu
Timofte, Luc Van Gool, Elisa Ricci
- Abstract summary: The goal of unpaired image-to-image translation is to produce an output image reflecting the target domain's style.
Because existing methods pay little attention to content changes, semantic information from source images degrades during translation.
We introduce a novel approach, Global and Local Alignment Networks (GLA-Net).
Our method effectively generates sharper and more realistic images than existing approaches.
- Score: 170.08142745705575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The goal of unpaired image-to-image translation is to produce an output image
reflecting the target domain's style while keeping unrelated contents of the
input source image unchanged. However, because existing methods pay little
attention to content changes, the semantic information from source images
degrades during translation. In this paper, to address this issue, we
introduce a novel approach, Global and Local Alignment Networks
(GLA-Net). The global alignment network aims to transfer the input image from
the source domain to the target domain. To effectively do so, we learn the
parameters (mean and standard deviation) of multivariate Gaussian distributions
as style features by using an MLP-Mixer based style encoder. To transfer the
style more accurately, we employ an adaptive instance normalization layer in
the encoder, with the parameters of the target multivariate Gaussian
distribution as input. We also adopt regularization and likelihood losses to
further reduce the domain gap and produce high-quality outputs. Additionally,
we introduce a local alignment network, which employs a pretrained
self-supervised model to produce an attention map via a novel local alignment
loss, ensuring that the translation network focuses on relevant pixels.
Extensive experiments conducted on five public datasets demonstrate that our
method effectively generates sharper and more realistic images than existing
approaches. Our code is available at https://github.com/ygjwd12345/GLANet.
Related papers
- Conditional Score Guidance for Text-Driven Image-to-Image Translation [52.73564644268749]
We present a novel algorithm for text-driven image-to-image translation based on a pretrained text-to-image diffusion model.
Our method aims to generate a target image by selectively editing the regions of interest in a source image.
arXiv Detail & Related papers (2023-05-29T10:48:34Z)
- Masked and Adaptive Transformer for Exemplar Based Image Translation [16.93344592811513]
Cross-domain semantic matching is challenging.
We propose a masked and adaptive transformer (MAT) for learning accurate cross-domain correspondence.
We devise a novel contrastive style learning method to acquire quality-discriminative style representations.
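The summary leaves the contrastive objective unspecified; a generic InfoNCE loss over style embeddings, with in-batch negatives and matched style pairs assumed as positives, shows the usual shape of contrastive style learning. Everything below is a hypothetical stand-in, not MAT's actual code.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor: torch.Tensor, positive: torch.Tensor,
             temperature: float = 0.07) -> torch.Tensor:
    """Generic InfoNCE loss: pull each anchor embedding toward its matched
    positive while pushing it away from every other sample in the batch.

    anchor, positive: (N, D) style embeddings; row i of `positive` is the
    positive pair for row i of `anchor`.
    """
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    logits = anchor @ positive.t() / temperature     # (N, N) similarities
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)           # diagonal = positives
```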
arXiv Detail & Related papers (2023-03-30T03:21:14Z)
- Smooth image-to-image translations with latent space interpolations [64.8170758294427]
Multi-domain image-to-image (I2I) translations can transform a source image according to the style of a target domain.
We show that our regularization techniques can improve the state-of-the-art I2I translations by a large margin.
arXiv Detail & Related papers (2022-10-03T11:57:30Z)
- Optimal transport meets noisy label robust loss and MixUp regularization for domain adaptation [13.080485957000462]
Deep neural networks trained on a source training set perform poorly on target images which do not belong to the training domain.
One strategy to improve performance is to align the source and target image distributions in an embedded space using optimal transport (OT).
We propose to couple the MixUp regularization (Zhang et al., 2018) with a loss that is robust to noisy labels in order to improve domain adaptation performance.
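Since the entry cites MixUp (Zhang et al., 2018) directly, a minimal sketch of that regularizer may help; it is independent of the paper's optimal-transport coupling and noisy-label robust loss, and the function below is illustrative rather than the authors' code.

```python
import torch

def mixup(x: torch.Tensor, y: torch.Tensor, alpha: float = 0.2):
    """MixUp (Zhang et al., 2018): train on convex combinations of
    random sample pairs and of their labels.

    x: (N, ...) batch of inputs; y: (N, K) one-hot (or soft) labels.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    # The same mixing coefficient is applied to inputs and labels.
    return lam * x + (1 - lam) * x[perm], lam * y + (1 - lam) * y[perm]
```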
arXiv Detail & Related papers (2022-06-22T15:40:52Z)
- DSP: Dual Soft-Paste for Unsupervised Domain Adaptive Semantic Segmentation [97.74059510314554]
Unsupervised domain adaptation (UDA) for semantic segmentation aims to adapt a segmentation model trained on the labeled source domain to the unlabeled target domain.
Existing methods try to learn domain invariant features while suffering from large domain gaps.
We propose a novel Dual Soft-Paste (DSP) method in this paper.
arXiv Detail & Related papers (2021-07-20T16:22:40Z)
- Deep Symmetric Adaptation Network for Cross-modality Medical Image Segmentation [40.95845629932874]
Unsupervised domain adaptation (UDA) methods have shown their promising performance in the cross-modality medical image segmentation tasks.
We present a novel deep symmetric architecture of UDA for medical image segmentation, which consists of a segmentation sub-network and two symmetric source and target domain translation sub-networks.
Our method has remarkable advantages compared to the state-of-the-art methods in both cross-modality Cardiac and BraTS segmentation tasks.
arXiv Detail & Related papers (2021-01-18T02:54:30Z)
- Label-Driven Reconstruction for Domain Adaptation in Semantic Segmentation [43.09068177612067]
Unsupervised domain adaptation alleviates the need for pixel-wise annotation in semantic segmentation.
One of the most common strategies is to translate images from the source domain to the target domain and then align their marginal distributions in the feature space using adversarial learning.
Here, we present an innovative framework designed to mitigate the image translation bias and align cross-domain features within the same category.
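The marginal-alignment step mentioned above follows the standard adversarial recipe: a domain discriminator learns to tell source features from target features, while the feature extractor is trained to fool it. A minimal sketch of those two losses, where the 256-dimensional pooled features and network sizes are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Domain discriminator over pooled feature vectors (sizes are assumptions).
domain_disc = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 1))

def marginal_alignment_losses(src_feat: torch.Tensor, tgt_feat: torch.Tensor):
    """Standard adversarial feature alignment: d_loss trains the
    discriminator to separate source (label 1) from target (label 0)
    features; g_loss trains the feature extractor to make target
    features indistinguishable from source ones."""
    src_logit, tgt_logit = domain_disc(src_feat), domain_disc(tgt_feat)
    d_loss = (F.binary_cross_entropy_with_logits(src_logit, torch.ones_like(src_logit))
              + F.binary_cross_entropy_with_logits(tgt_logit, torch.zeros_like(tgt_logit)))
    g_loss = F.binary_cross_entropy_with_logits(tgt_logit, torch.ones_like(tgt_logit))
    return d_loss, g_loss
```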
arXiv Detail & Related papers (2020-03-10T10:06:35Z)
- A U-Net Based Discriminator for Generative Adversarial Networks [86.67102929147592]
We propose an alternative U-Net based discriminator architecture for generative adversarial networks (GANs).
The proposed architecture provides detailed per-pixel feedback to the generator while maintaining the global coherence of synthesized images.
The novel discriminator improves over the state of the art on standard distribution and image quality metrics.
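The per-pixel-plus-global feedback is easy to picture with a toy architecture: the bottleneck yields a single whole-image realism logit while the skip-connected decoder yields one logit per pixel. The sketch below is a deliberately shallow stand-in, not the paper's discriminator.

```python
import torch
import torch.nn as nn

class UNetDiscriminator(nn.Module):
    """Tiny U-Net style discriminator: a global logit from the bottleneck
    judges overall realism; the decoder's logit map gives the generator
    spatially detailed, per-pixel feedback."""

    def __init__(self, in_ch: int = 3, base: int = 64):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, base, 4, 2, 1), nn.LeakyReLU(0.2))
        self.enc2 = nn.Sequential(nn.Conv2d(base, base * 2, 4, 2, 1), nn.LeakyReLU(0.2))
        self.global_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                         nn.Linear(base * 2, 1))
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(base * 2, base, 4, 2, 1),
                                  nn.LeakyReLU(0.2))
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(base * 2, base, 4, 2, 1),
                                  nn.LeakyReLU(0.2))
        self.pixel_head = nn.Conv2d(base, 1, 3, 1, 1)

    def forward(self, x):
        e1 = self.enc1(x)                     # (N, base, H/2, W/2)
        e2 = self.enc2(e1)                    # (N, 2*base, H/4, W/4)
        global_logit = self.global_head(e2)   # (N, 1): whole-image realism
        d = self.dec2(e2)                     # (N, base, H/2, W/2)
        d = self.dec1(torch.cat([d, e1], 1))  # skip connection, back to (N, base, H, W)
        pixel_logits = self.pixel_head(d)     # (N, 1, H, W): per-pixel realism
        return global_logit, pixel_logits
```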
arXiv Detail & Related papers (2020-02-28T11:16:54Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature-matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global content consistency.
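The summary does not detail the "dense combinations of dilated convolutions", but the receptive-field effect itself is standard: stacking 3x3 convolutions with dilations 1, 2 and 4 already covers a 15x15 neighborhood without any pooling or loss of resolution. A minimal illustration:

```python
import torch
import torch.nn as nn

# Growing dilation rates widen the receptive field rapidly while keeping
# the parameter count of a plain 3x3 conv and preserving spatial size
# (padding equals dilation for a 3x3 kernel).
dilated_block = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1, dilation=1), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=2, dilation=2), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=4, dilation=4), nn.ReLU(),
)

x = torch.randn(1, 64, 128, 128)
print(dilated_block(x).shape)  # spatial size preserved: (1, 64, 128, 128)
```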
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.