Leveraging Local Domains for Image-to-Image Translation
- URL: http://arxiv.org/abs/2109.04468v1
- Date: Thu, 9 Sep 2021 17:59:52 GMT
- Title: Leveraging Local Domains for Image-to-Image Translation
- Authors: Anthony Dell'Eva, Fabio Pizzati, Massimo Bertozzi, Raoul de Charette
- Abstract summary: Image-to-image (i2i) networks struggle to capture local changes because they do not affect the global scene structure.
We leverage human knowledge about spatial domain characteristics, which we refer to as 'local domains'.
We train a patch-based GAN on a few source images and hallucinate a new unseen domain, which subsequently eases transfer learning to the target.
- Score: 11.03611991082568
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image-to-image (i2i) networks struggle to capture local changes because they
do not affect the global scene structure. For example, translating from highway
scenes to offroad, i2i networks easily focus on global color features but
ignore obvious traits for humans like the absence of lane markings. In this
paper, we leverage human knowledge about spatial domain characteristics which
we refer to as 'local domains' and demonstrate its benefit for image-to-image
translation. Relying on simple geometrical guidance, we train a patch-based
GAN on a few source images and hallucinate a new unseen domain, which subsequently
eases transfer learning to the target. We experiment on three tasks ranging from
unstructured environments to adverse weather. Our comprehensive evaluation
setting shows we are able to generate realistic translations, with minimal
priors, and training only on a few images. Furthermore, we show that when trained
on our translated images, all tested proxy tasks are significantly improved,
without ever seeing the target domain at training.
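The abstract's core recipe is to use a simple geometric prior to carve out a "local domain" (e.g. the road surface) and train a patch-based GAN only on patches from that region. As a minimal sketch of the patch-sampling step — the function name, mask convention, and defaults below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def sample_local_patches(image, region_mask, patch_size=32, n_patches=8, rng=None):
    """Sample square patches whose centers lie inside a 'local domain' mask.

    The mask encodes a simple geometric prior (e.g. the road surface in the
    lower half of the frame). Patches sampled this way could feed a
    patch-based GAN; this helper only covers the sampling.
    """
    rng = np.random.default_rng(rng)
    h, w = region_mask.shape
    half = patch_size // 2
    # Candidate centers: mask pixels far enough from the border to fit a patch.
    ys, xs = np.nonzero(region_mask)
    valid = (ys >= half) & (ys < h - half) & (xs >= half) & (xs < w - half)
    ys, xs = ys[valid], xs[valid]
    idx = rng.choice(len(ys), size=n_patches, replace=True)
    return np.stack([
        image[y - half:y + half, x - half:x + half]
        for y, x in zip(ys[idx], xs[idx])
    ])

# Example: a crude geometric prior marking the lower half of the frame.
img = np.zeros((128, 128, 3), dtype=np.float32)
mask = np.zeros((128, 128), dtype=bool)
mask[64:, :] = True
patches = sample_local_patches(img, mask, patch_size=32, n_patches=8, rng=0)
print(patches.shape)  # (8, 32, 32, 3)
```

Restricting sampling to the masked region is what injects the human spatial prior: the GAN only ever sees the part of the scene where the local change (e.g. lane markings vanishing) actually happens.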
Related papers
- Domain-Controlled Prompt Learning [49.45309818782329]
Existing prompt learning methods often lack domain-awareness or domain-transfer mechanisms.
We propose Domain-Controlled Prompt Learning for specific domains.
Our method achieves state-of-the-art performance in specific domain image recognition datasets.
arXiv Detail & Related papers (2023-09-30T02:59:49Z)
- Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation [97.40572668025273]
We use text-guided latent diffusion models for zero-shot image-to-image translation (I2I) across large domain gaps.
Being able to perform translations across large domain gaps has a wide variety of real-world applications in criminology, astrology, environmental conservation, and paleontology.
arXiv Detail & Related papers (2023-08-14T17:59:31Z)
- ACE: Zero-Shot Image to Image Translation via Pretrained Auto-Contrastive-Encoder [2.1874189959020427]
We propose a new approach to extract image features by learning the similarities and differences of samples within the same data distribution.
The design of ACE enables us, for the first time, to achieve zero-shot image-to-image translation with no training on image translation tasks.
Our model achieves competitive results on multimodal image translation tasks with zero-shot learning as well.
arXiv Detail & Related papers (2023-02-22T23:52:23Z)
- Using Language to Extend to Unseen Domains [81.37175826824625]
It is expensive to collect training data for every possible domain that a vision model may encounter when deployed.
We consider how simply verbalizing the training domain as well as domains we want to extend to but do not have data for can improve robustness.
Using a multimodal model with a joint image and language embedding space, our method LADS learns a transformation of the image embeddings from the training domain to each unseen test domain.
arXiv Detail & Related papers (2022-10-18T01:14:02Z)
- Leveraging in-domain supervision for unsupervised image-to-image translation tasks via multi-stream generators [4.726777092009554]
We introduce two techniques to incorporate this invaluable in-domain prior knowledge for the benefit of translation quality.
We propose splitting the input data according to semantic masks, explicitly guiding the network to different behavior for the different regions of the image.
In addition, we propose training a semantic segmentation network along with the translation task, and to leverage this output as a loss term that improves robustness.
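The multi-stream idea above splits the input by semantic masks so each region gets its own generator behavior. A minimal sketch of that splitting step — the per-class zero-fill convention and function name are illustrative assumptions, not the paper's actual pipeline:

```python
import numpy as np

def split_into_streams(image, seg_mask, class_ids):
    """Split an image into per-class streams using a semantic mask.

    Each stream keeps only the pixels of one semantic class and zeros the
    rest, so a multi-stream generator can treat regions differently.
    """
    streams = []
    for cid in class_ids:
        keep = (seg_mask == cid)[..., None]  # broadcast mask over channels
        streams.append(np.where(keep, image, 0.0))
    return np.stack(streams)                 # (n_classes, H, W, C)

img = np.ones((4, 4, 3), dtype=np.float32)
seg = np.zeros((4, 4), dtype=np.int64)
seg[2:, :] = 1                               # class 1 = lower half
streams = split_into_streams(img, seg, class_ids=[0, 1])
print(streams.shape)        # (2, 4, 4, 3)
print(streams[0, 3, 0, 0])  # 0.0 — class-0 stream masks out class-1 pixels
```

Each stream can then be routed to a dedicated generator branch, which is what explicitly guides the network to different behavior per region.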
arXiv Detail & Related papers (2021-12-30T15:29:36Z)
- Self-Supervised Learning of Domain Invariant Features for Depth Estimation [35.74969527929284]
We tackle the problem of unsupervised synthetic-to-realistic domain adaptation for single image depth estimation.
An essential building block of single image depth estimation is an encoder-decoder task network that takes RGB images as input and produces depth maps as output.
We propose a novel training strategy to force the task network to learn domain invariant representations in a self-supervised manner.
arXiv Detail & Related papers (2021-06-04T16:45:48Z)
- The Spatially-Correlative Loss for Various Image Translation Tasks [69.62228639870114]
We propose a novel spatially-correlative loss that is simple, efficient and yet effective for preserving scene structure consistency.
Previous methods attempt this by using pixel-level cycle-consistency or feature-level matching losses.
We show distinct improvement over baseline models in all three modes of unpaired I2I translation: single-modal, multi-modal, and even single-image translation.
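A spatially-correlative loss compares *self-similarity* maps rather than pixels or features directly, so it penalizes structural changes while ignoring appearance changes. A simplified numpy sketch (plain cosine self-similarity plus an L1 comparison, standing in for the paper's learned variant):

```python
import numpy as np

def self_similarity(feat):
    """Pairwise cosine similarity between all spatial feature vectors.

    feat: (H, W, C) feature map. Returns an (H*W, H*W) map that encodes
    scene structure independently of appearance.
    """
    f = feat.reshape(-1, feat.shape[-1])
    f = f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-8)
    return f @ f.T

def spatially_correlative_loss(feat_src, feat_tgt):
    """L1 distance between the two self-similarity maps."""
    return np.abs(self_similarity(feat_src) - self_similarity(feat_tgt)).mean()

# Identical structure with a different appearance (globally scaled features)
# yields a near-zero loss: cosine similarity ignores the scale change.
src = np.random.default_rng(0).normal(size=(8, 8, 16))
loss_same = spatially_correlative_loss(src, 2.0 * src)
print(round(loss_same, 6))  # 0.0
```

Because only the correlation pattern is matched, a translation that repaints the scene but keeps its layout incurs little penalty, unlike pixel-level cycle-consistency.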
arXiv Detail & Related papers (2021-04-02T02:13:30Z)
- Crossing-Domain Generative Adversarial Networks for Unsupervised Multi-Domain Image-to-Image Translation [12.692904507625036]
We propose a general framework for unsupervised image-to-image translation across multiple domains.
Our proposed framework consists of a pair of encoders and a pair of GANs that learn high-level features across different domains in order to generate diverse and realistic samples.
arXiv Detail & Related papers (2020-08-27T01:54:07Z)
- Spatial Attention Pyramid Network for Unsupervised Domain Adaptation [66.75008386980869]
Unsupervised domain adaptation is critical in various computer vision tasks.
We design a new spatial attention pyramid network for unsupervised domain adaptation.
Our method performs favorably against the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-03-29T09:03:23Z)
- CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency [119.45667331836583]
Unsupervised domain adaptation algorithms aim to transfer the knowledge learned from one domain to another.
We present a novel pixel-wise adversarial domain adaptation algorithm.
arXiv Detail & Related papers (2020-01-09T19:00:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.