PI-Trans: Parallel-ConvMLP and Implicit-Transformation Based GAN for
Cross-View Image Translation
- URL: http://arxiv.org/abs/2207.04242v1
- Date: Sat, 9 Jul 2022 10:35:44 GMT
- Authors: Bin Ren, Hao Tang, Yiming Wang, Xia Li, Wei Wang, Nicu Sebe
- Abstract summary: We propose a novel generative adversarial network, PI-Trans, which consists of a novel Parallel-ConvMLP module and an Implicit Transformation module at multiple semantic levels.
PI-Trans achieves the best qualitative and quantitative performance by a large margin compared to the state-of-the-art methods on two challenging datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For semantic-guided cross-view image translation, it is crucial to learn
where to sample pixels from the source view image and where to reallocate them
guided by the target view semantic map, especially when there is little overlap
or drastic view difference between the source and target images. Hence, one not
only needs to encode the long-range dependencies among pixels in both the
source view image and the target view semantic map, but also needs to translate
these learned dependencies. To this end, we propose a novel generative
adversarial network, PI-Trans, which mainly consists of a novel
Parallel-ConvMLP module and an Implicit Transformation module at multiple
semantic levels. Extensive experimental results show that the proposed PI-Trans
achieves the best qualitative and quantitative performance by a large margin
compared to the state-of-the-art methods on two challenging datasets. The code
will be made available at https://github.com/Amazingren/PI-Trans.
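The abstract only names the Parallel-ConvMLP module; its implementation is not given here. As an illustration of the general idea (a convolutional branch and a per-pixel channel-MLP branch running in parallel over the same feature map), here is a minimal numpy sketch. All shapes, names, and the fusion-by-addition choice are assumptions, not the paper's actual design.

```python
import numpy as np

def depthwise_conv3x3(x, kernels):
    """Depthwise 3x3 convolution with zero padding.
    x: (H, W, C) feature map; kernels: (3, 3, C), one filter per channel."""
    H, W, C = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            patch = padded[i:i + 3, j:j + 3, :]        # (3, 3, C) window
            out[i, j] = np.sum(patch * kernels, axis=(0, 1))
    return out

def channel_mlp(x, w1, w2):
    """Per-pixel two-layer MLP over the channel dimension.
    w1: (C, hidden), w2: (hidden, C)."""
    h = np.maximum(x @ w1, 0.0)                        # ReLU
    return h @ w2

def parallel_convmlp(x, kernels, w1, w2):
    """Run the conv branch and the MLP branch in parallel on the same
    input and fuse by addition (the fusion choice is an assumption)."""
    return depthwise_conv3x3(x, kernels) + channel_mlp(x, w1, w2)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 16))
y = parallel_convmlp(x,
                     rng.standard_normal((3, 3, 16)),
                     rng.standard_normal((16, 32)),
                     rng.standard_normal((32, 16)))
print(y.shape)  # (8, 8, 16): spatial and channel sizes preserved
```

The conv branch captures local spatial context while the MLP branch mixes channels at each pixel; running them in parallel lets both kinds of dependency contribute to the same output map.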
Related papers
- Exploring Multi-view Pixel Contrast for General and Robust Image Forgery Localization [4.8454936010479335]
We propose a Multi-view Pixel-wise Contrastive algorithm (MPC) for image forgery localization.
Specifically, we first pre-train the backbone network with the supervised contrastive loss.
Then the localization head is fine-tuned using the cross-entropy loss, resulting in a better pixel localizer.
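The two-stage recipe above (supervised contrastive pre-training, then cross-entropy fine-tuning of the localization head) hinges on a supervised contrastive loss. A minimal per-anchor version of the commonly used formulation is sketched below; the variable names and temperature value are assumptions, and this is not the paper's exact loss.

```python
import numpy as np

def sup_contrastive_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss over L2-normalized embeddings.
    features: (N, D); labels: (N,). For each anchor, samples with the
    same label are positives; all other samples act as negatives."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T / temperature                  # (N, N) scaled similarities
    n = len(labels)
    loss = 0.0
    for i in range(n):
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue                             # anchor with no positives
        others = [j for j in range(n) if j != i]
        logsum = np.log(np.sum(np.exp(sim[i, others])))
        # average negative log-likelihood of the positives
        loss += -np.mean([sim[i, j] - logsum for j in pos])
    return loss / n

rng = np.random.default_rng(1)
feats = rng.standard_normal((6, 8))
labels = np.array([0, 0, 1, 1, 2, 2])
print(sup_contrastive_loss(feats, labels))
```

Minimizing this pulls same-label embeddings together and pushes different-label embeddings apart, which is why it makes a good pre-training signal before the pixel-level cross-entropy fine-tuning.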
arXiv Detail & Related papers (2024-06-19T13:51:52Z)
- SCONE-GAN: Semantic Contrastive learning-based Generative Adversarial Network for an end-to-end image translation [18.93434486338439]
SCONE-GAN is shown to be effective for learning to generate realistic and diverse scenery images.
To generate more realistic and diverse images, we introduce a style reference image.
We validate the proposed algorithm for image-to-image translation and stylizing outdoor images.
arXiv Detail & Related papers (2023-11-07T10:29:16Z)
- Multi-Curve Translator for Real-Time High-Resolution Image-to-Image Translation [24.651984136294242]
Multi-Curve Translator (MCT) predicts translated pixels for corresponding input pixels and neighboring pixels.
MCT makes it possible to feed the network only the downsampled image to perform the mapping for the full-resolution image.
MCT variants can process 4K images in real-time and achieve comparable or even better performance than the base models.
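The curve idea described above can be illustrated with a toy example: a small network run on the downsampled image would predict a few curve knots, and every full-resolution pixel is then mapped through the curve by interpolation. The sketch below shows only that cheap lookup step; the knot count, even knot spacing, and piecewise-linear form are assumptions, not MCT's actual parameterization.

```python
import numpy as np

def apply_curve(pixels, knots):
    """Map intensities in [0, 1] through a piecewise-linear curve.
    pixels: array of any shape with values in [0, 1]; knots: (K,) output
    values at K evenly spaced input positions. In an MCT-style model the
    knots would be predicted from the downsampled image, then applied
    per pixel at full resolution."""
    xs = np.linspace(0.0, 1.0, len(knots))       # fixed input positions
    return np.interp(pixels, xs, knots)

# toy brightening curve applied to a "full-resolution" image
image = np.linspace(0.0, 1.0, 16).reshape(4, 4)
curve = np.array([0.0, 0.4, 0.7, 0.9, 1.0])     # 5 assumed knots
out = apply_curve(image, curve)
print(out.min(), out.max())  # output stays within [0.0, 1.0]
```

Because the expensive network only sees the downsampled image and the full-resolution work reduces to a per-pixel table lookup, this style of translator can run in real time on 4K inputs.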
arXiv Detail & Related papers (2022-03-15T10:06:39Z)
- CRIS: CLIP-Driven Referring Image Segmentation [71.56466057776086]
We propose an end-to-end CLIP-Driven Referring Image Segmentation framework (CRIS).
CRIS resorts to vision-language decoding and contrastive learning to achieve text-to-pixel alignment.
Our proposed framework significantly outperforms state-of-the-art methods without any post-processing.
arXiv Detail & Related papers (2021-11-30T07:29:08Z)
- Global and Local Alignment Networks for Unpaired Image-to-Image Translation [170.08142745705575]
The goal of unpaired image-to-image translation is to produce an output image reflecting the target domain's style.
Because existing methods pay little attention to content changes, semantic information from source images degrades during translation.
We introduce a novel approach, Global and Local Alignment Networks (GLA-Net).
Our method effectively generates sharper and more realistic images than existing approaches.
arXiv Detail & Related papers (2021-11-19T18:01:54Z)
- Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation [70.00392682183515]
Previous cross-view image translation methods struggle to generate high-quality images at the target view.
We propose a novel two-stage framework with a new Cascaded Cross MLP-Mixer (CrossMLP) sub-network.
In the first stage, the CrossMLP sub-network learns the latent transformation cues between image code and semantic map code.
In the second stage, we design a refined pixel-level loss that eases the noisy semantic label problem.
arXiv Detail & Related papers (2021-10-19T18:03:30Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- Multi-Channel Attention Selection GANs for Guided Image-to-Image Translation [148.9985519929653]
We propose a novel model named Multi-Channel Attention Selection Generative Adversarial Network (SelectionGAN) for guided image-to-image translation.
The proposed framework and modules are unified solutions and can be applied to solve other generation tasks such as semantic image synthesis.
arXiv Detail & Related papers (2020-02-03T23:17:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.