Few-Shot Unsupervised Image-to-Image Translation on complex scenes
- URL: http://arxiv.org/abs/2106.03770v1
- Date: Mon, 7 Jun 2021 16:33:19 GMT
- Title: Few-Shot Unsupervised Image-to-Image Translation on complex scenes
- Authors: Luca Barras, Samuel Chassot, Daniel Filipe Nunes Silva
- Abstract summary: In this work, we assess how a method initially developed for single-object translation performs on more diverse and content-rich images.
We present a way to extend a dataset based on object detection. Moreover, we propose a way to adapt the FUNIT framework to leverage the power of object detection.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Unsupervised image-to-image translation methods have received a lot of attention in the last few years. Multiple techniques have emerged, tackling the initial challenge from different perspectives. Some focus on learning as much as possible from several target style images for translation, while others make use of object detection in order to produce more realistic results on content-rich scenes. In this work, we assess how a method initially developed for single-object translation performs on more diverse and content-rich images. Our work is based on the FUNIT[1] framework, which we train with a more diverse dataset. This helps in understanding how such methods behave beyond their initial frame of application. We present a way to extend a dataset based on object detection. Moreover, we propose a way to adapt the FUNIT framework to leverage the power of object detection, as seen in other methods.
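The dataset-extension idea lends itself to a short illustration. Below is a minimal sketch, assuming a pretrained torchvision Faster R-CNN stands in for the detector; the score threshold and minimum crop size are illustrative choices, not the paper's exact pipeline.

```python
# Hypothetical sketch: build an object-centric dataset from complex scenes
# by cropping high-confidence detections (not the paper's exact pipeline).
import torch
from torchvision.io import read_image
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import to_pil_image

detector = fasterrcnn_resnet50_fpn(pretrained=True).eval()

def extract_objects(image_path, score_threshold=0.8, min_size=64):
    """Crop confident detections from one scene image into object images."""
    img = read_image(image_path).float() / 255.0  # CxHxW in [0, 1]
    with torch.no_grad():
        preds = detector([img])[0]
    crops = []
    for box, score in zip(preds["boxes"], preds["scores"]):
        if score < score_threshold:
            continue
        x1, y1, x2, y2 = box.int().tolist()
        if (x2 - x1) < min_size or (y2 - y1) < min_size:
            continue  # tiny crops make poor translation training images
        crops.append(to_pil_image(img[:, y1:y2, x1:x2]))
    return crops
```

Each crop can then be treated as a single-object training image for a FUNIT-style translator.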
Related papers
- SCONE-GAN: Semantic Contrastive learning-based Generative Adversarial Network for an end-to-end image translation [18.93434486338439]
SCONE-GAN is shown to be effective for learning to generate realistic and diverse scenery images.
For more realistic and diverse image generation, we introduce a style reference image.
We validate the proposed algorithm for image-to-image translation and stylizing outdoor images.
arXiv Detail & Related papers (2023-11-07T10:29:16Z)
- DALL-E for Detection: Language-driven Context Image Synthesis for Object Detection [18.276823176045525]
We propose a new paradigm for automatic context image generation at scale.
At the core of our approach lies utilizing an interplay between language description of context and language-driven image generation.
We demonstrate the advantages of our approach over the prior context image generation approaches on four object detection datasets.
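As a rough illustration of language-driven context synthesis, the sketch below uses Stable Diffusion via the diffusers library as a stand-in generator; the model id and prompts are assumptions, not the paper's setup.

```python
# Illustrative only: synthesize context/background images from language,
# with Stable Diffusion standing in for the paper's generator.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # model id illustrative
    torch_dtype=torch.float16,
).to("cuda")

# Hypothetical context descriptions for the target object classes.
context_prompts = [
    "a kitchen counter with empty space",
    "a grassy park on a sunny day",
]
backgrounds = [pipe(p).images[0] for p in context_prompts]
# Detection training data is then composed by pasting annotated object
# instances onto these synthesized contexts.
```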
arXiv Detail & Related papers (2022-06-20T06:43:17Z)
- Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone [170.85076677740292]
We present FIBER (Fusion-In-the-Backbone-based transformER), a new model architecture for vision-language (VL) pre-training.
Instead of having dedicated transformer layers for fusion after the uni-modal backbones, FIBER pushes multimodal fusion deep into the model.
We conduct comprehensive experiments on a wide range of VL tasks, ranging from VQA, image captioning, and retrieval, to phrase grounding, referring expression comprehension, and object detection.
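A gated cross-attention block conveys the "fusion in the backbone" idea; the sketch below is a simplification under assumed dimensions, not FIBER's actual architecture.

```python
# Minimal sketch of fusion inside a backbone block: uni-modal self-attention
# plus gated cross-attention into the other modality (simplified, not FIBER).
import torch
import torch.nn as nn

class FusedBlock(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.gate = nn.Parameter(torch.zeros(1))  # fusion is a no-op at init

    def forward(self, x, other):
        h = self.norm1(x)
        x = x + self.self_attn(h, h, h)[0]  # usual uni-modal path
        # Gated cross-attention into the other modality's tokens.
        x = x + self.gate * self.cross_attn(self.norm2(x), other, other)[0]
        return x

vision_tokens = torch.randn(2, 196, 256)  # e.g. image patch tokens
text_tokens = torch.randn(2, 32, 256)     # e.g. word tokens
fused = FusedBlock()(vision_tokens, text_tokens)
```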
arXiv Detail & Related papers (2022-06-15T16:41:29Z)
- Unsupervised Image-to-Image Translation with Generative Prior [103.54337984566877]
Unsupervised image-to-image translation aims to learn the translation between two visual domains without paired data.
We present a novel framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), to improve the overall quality and applicability of the translation algorithm.
arXiv Detail & Related papers (2022-04-07T17:59:23Z)
- Multi-domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution [62.4972011636884]
We propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework.
We exploit the decomposed content feature and appearance adaptive convolution to translate an image into a target appearance.
We show that the proposed method produces visually diverse and plausible results in multiple domains compared to the state-of-the-art methods.
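One plausible reading of "appearance adaptive convolution" is a convolution whose kernels are predicted per sample from an appearance code; the sketch below is that reading, not the authors' implementation.

```python
# Hedged sketch: convolve each sample's content feature with kernels
# predicted from its target appearance code (simplified, not MDUIT's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AppearanceAdaptiveConv(nn.Module):
    def __init__(self, channels=64, app_dim=128, k=3):
        super().__init__()
        self.channels, self.k = channels, k
        # Map an appearance code to a full conv kernel for this layer.
        self.to_kernel = nn.Linear(app_dim, channels * channels * k * k)

    def forward(self, content, appearance):
        b, c, h, w = content.shape
        kernels = self.to_kernel(appearance).view(b * c, c, self.k, self.k)
        # Grouped-conv trick: fold the batch into channels so every sample
        # is convolved with its own predicted kernel in one call.
        out = F.conv2d(content.view(1, b * c, h, w), kernels,
                       padding=self.k // 2, groups=b)
        return out.view(b, c, h, w)

content = torch.randn(2, 64, 32, 32)  # decomposed content feature
appearance = torch.randn(2, 128)      # target-domain appearance code
stylized = AppearanceAdaptiveConv()(content, appearance)
```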
arXiv Detail & Related papers (2022-02-06T14:12:34Z)
- Panoptic-based Object Style-Align for Image-to-Image Translation [2.226472061870956]
We propose panoptic-based object style-align generative adversarial networks (POSA-GANs) for image-to-image translation.
The proposed method was systematically compared with different competing methods and obtained significant improvement on both image quality and object recognition performance for translated images.
arXiv Detail & Related papers (2021-12-03T14:28:11Z)
- A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection [56.82077636126353]
We take advantage of object-centric images to improve object detection in scene-centric images.
We present a simple yet surprisingly effective framework to do so.
Our approach can improve the object detection (and instance segmentation) accuracy of rare objects by 50% (and 33%) in relative terms.
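The core mechanism, pasting object-centric images into scene-centric ones, can be sketched in a few lines; placement and scaling here are toy choices rather than the paper's strategy.

```python
# Toy sketch: paste a rescaled object-centric image of a rare class into a
# scene-centric training image; the pasted box becomes a new annotation.
import random
from PIL import Image

def paste_object(scene: Image.Image, obj: Image.Image, scale=0.3):
    w = max(1, int(scene.width * scale))
    h = max(1, int(obj.height * w / obj.width))  # keep the aspect ratio
    obj = obj.resize((w, h))
    x = random.randint(0, max(0, scene.width - w))
    y = random.randint(0, max(0, scene.height - h))
    out = scene.copy()
    out.paste(obj, (x, y))
    return out, (x, y, x + w, y + h)  # image plus ground-truth box
```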
arXiv Detail & Related papers (2021-02-17T17:27:21Z)
- Image Translation via Fine-grained Knowledge Transfer [36.898373109689814]
We propose an interpretable knowledge-based image-translation framework, which realizes the image translation through knowledge retrieval and transfer.
In detail, the framework constructs a plug-and-play, model-agnostic, general-purpose knowledge library, remembering task-specific styles, tones, texture patterns, etc.
arXiv Detail & Related papers (2020-12-21T09:18:48Z)
- Contrastive Learning for Unpaired Image-to-Image Translation [64.47477071705866]
In image-to-image translation, each patch in the output should reflect the content of the corresponding patch in the input, independent of domain.
We propose a framework based on contrastive learning to maximize mutual information between the two.
We demonstrate that our framework enables one-sided translation in the unpaired image-to-image translation setting, while improving quality and reducing training time.
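The patchwise contrastive objective is compact enough to sketch: a patch feature from the output is matched against the feature of the same location in the input, with other locations serving as negatives. This is a simplified InfoNCE in that spirit, not the paper's full multi-layer loss.

```python
# Sketch of a patchwise InfoNCE loss: positives are same-location patch
# features across input and output; all other locations are negatives.
import torch
import torch.nn.functional as F

def patch_nce_loss(feat_in, feat_out, tau=0.07):
    """feat_in, feat_out: (num_patches, dim) features of sampled patches."""
    feat_in = F.normalize(feat_in, dim=1)
    feat_out = F.normalize(feat_out, dim=1)
    logits = feat_out @ feat_in.t() / tau    # (N, N) patch similarities
    targets = torch.arange(feat_in.size(0))  # positives on the diagonal
    return F.cross_entropy(logits, targets)

loss = patch_nce_loss(torch.randn(256, 128), torch.randn(256, 128))
```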
arXiv Detail & Related papers (2020-07-30T17:59:58Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
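One way to read "learning invariance to backgrounds" is as an augmentation that composites the same foreground onto different backgrounds to form positive pairs; the sketch below assumes a foreground mask is available, a simplification of the paper's approach.

```python
# Hedged sketch: composite the same foreground onto different backgrounds so
# a contrastive model cannot rely on background cues (a simplification).
import torch

def background_swap(fg, mask, backgrounds):
    """fg: (C,H,W) image; mask: (1,H,W) foreground mask in {0,1}."""
    bg = backgrounds[torch.randint(len(backgrounds), (1,)).item()]
    return fg * mask + bg * (1 - mask)

# Two views of one image with swapped backgrounds form a positive pair:
# view1 = background_swap(img, mask, bg_pool)
# view2 = background_swap(img, mask, bg_pool)
```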
arXiv Detail & Related papers (2020-04-14T16:29:42Z)