SynCDR : Training Cross Domain Retrieval Models with Synthetic Data
- URL: http://arxiv.org/abs/2401.00420v2
- Date: Tue, 19 Mar 2024 16:56:53 GMT
- Title: SynCDR : Training Cross Domain Retrieval Models with Synthetic Data
- Authors: Samarth Mishra, Carlos D. Castillo, Hongcheng Wang, Kate Saenko, Venkatesh Saligrama
- Abstract summary: In cross-domain retrieval, a model is required to identify images from the same semantic category across two visual domains.
We show how to generate synthetic data to fill in these missing category examples across domains.
Our best SynCDR model can outperform prior art by up to 15%.
- Score: 69.26882668598587
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In cross-domain retrieval, a model is required to identify images from the same semantic category across two visual domains. For instance, given a sketch of an object, a model needs to retrieve a real image of it from an online store's catalog. A standard approach for such a problem is learning a feature space of images where Euclidean distances reflect similarity. Even without human annotations, which may be expensive to acquire, prior methods function reasonably well using unlabeled images for training. Our problem constraint takes this further to scenarios where the two domains do not necessarily share any common categories in training data. This can occur when the two domains in question come from different versions of some biometric sensor recording identities of different people. We posit a simple solution, which is to generate synthetic data to fill in these missing category examples across domains. We do this via category-preserving translation of images from one visual domain to another. We compare approaches specifically trained for this translation for a pair of domains, as well as those that can use large-scale pre-trained text-to-image diffusion models via prompts, and find that the latter can generate better replacement synthetic data, leading to more accurate cross-domain retrieval models. Our best SynCDR model can outperform prior art by up to 15%. Code for our work is available at https://github.com/samarth4149/SynCDR.
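The feature-space approach described in the abstract can be sketched as nearest-neighbor lookup under Euclidean distance. The random embeddings and the `retrieve` helper below are illustrative stand-ins for a trained encoder's output, not the SynCDR code itself:

```python
import numpy as np

# A minimal sketch of embedding-based cross-domain retrieval, assuming a
# trained encoder has already mapped images (e.g. sketches and photos)
# into a shared feature space. The embeddings here are random stand-ins.

def retrieve(query_emb: np.ndarray, gallery_embs: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k gallery items nearest to the query
    under Euclidean distance, as in the feature-space formulation."""
    dists = np.linalg.norm(gallery_embs - query_emb, axis=1)
    return np.argsort(dists)[:k]

rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 64))              # e.g. photo-domain features
query = gallery[42] + 0.01 * rng.normal(size=64)  # a near-duplicate query
print(retrieve(query, gallery, k=3))              # index 42 should rank first
```

In practice the gallery embeddings would be precomputed once per catalog, and an approximate nearest-neighbor index would replace the brute-force distance scan at scale.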
Related papers
- Hybrid diffusion models: combining supervised and generative pretraining for label-efficient fine-tuning of segmentation models [55.2480439325792]
We propose a new pretext task, which is to perform image denoising and mask prediction simultaneously on the first domain.
We show that fine-tuning a model pretrained using this approach leads to better results than fine-tuning a similar model trained using either supervised or unsupervised pretraining.
arXiv Detail & Related papers (2024-08-06T20:19:06Z)
- ZoDi: Zero-Shot Domain Adaptation with Diffusion-Based Image Transfer [13.956618446530559]
This paper proposes a zero-shot domain adaptation method based on diffusion models, called ZoDi.
First, we utilize an off-the-shelf diffusion model to synthesize target-like images by transferring the domain of source images to the target domain.
Secondly, we train the model using both source images and synthesized images with the original representations to learn domain-robust representations.
arXiv Detail & Related papers (2024-03-20T14:58:09Z)
- Adapt Anything: Tailor Any Image Classifiers across Domains And Categories Using Text-to-Image Diffusion Models [82.95591765009105]
We aim to study if a modern text-to-image diffusion model can tailor any task-adaptive image classifier across domains and categories.
We utilize only one off-the-shelf text-to-image model to synthesize images with category labels derived from the corresponding text prompts.
arXiv Detail & Related papers (2023-10-25T11:58:14Z)
- Domain-Scalable Unpaired Image Translation via Latent Space Anchoring [88.7642967393508]
Unpaired image-to-image translation (UNIT) aims to map images between two visual domains without paired training data.
We propose a new domain-scalable UNIT method, termed latent space anchoring.
Our method anchors images of different domains to the same latent space of frozen GANs by learning lightweight encoder and regressor models.
In the inference phase, the learned encoders and decoders of different domains can be arbitrarily combined to translate images between any two domains without fine-tuning.
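The mix-and-match idea above, where any encoder can be composed with any decoder through a shared latent space, can be sketched with toy linear maps. These stand in for the paper's trained encoder/regressor models and frozen GAN, purely to illustrate the composition:

```python
import numpy as np

# Illustrative sketch of latent space anchoring's mix-and-match inference:
# each domain gets an encoder into one shared latent space and a decoder
# out of it, so any encoder/decoder pair composes into a translator.
# The random linear maps are toy stand-ins, not trained models.

rng = np.random.default_rng(1)
latent_dim, img_dim = 8, 32

encoders = {d: rng.normal(size=(latent_dim, img_dim)) for d in ("sketch", "photo")}
decoders = {d: rng.normal(size=(img_dim, latent_dim)) for d in ("sketch", "photo")}

def translate(x: np.ndarray, src: str, dst: str) -> np.ndarray:
    """Encode from the source domain, decode into the target domain."""
    z = encoders[src] @ x     # anchor the input in the shared latent space
    return decoders[dst] @ z  # realize it in the target domain

x = rng.normal(size=img_dim)
y = translate(x, "sketch", "photo")  # sketch -> photo, no fine-tuning needed
print(y.shape)
```

Because every domain is anchored to the same latent space, adding an (N+1)-th domain only requires training one new encoder/decoder pair rather than translators for all N existing pairings.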
arXiv Detail & Related papers (2023-06-26T17:50:02Z)
- Dual-Domain Image Synthesis using Segmentation-Guided GAN [33.00724627120716]
We introduce a segmentation-guided approach to synthesise images that integrate features from two distinct domains.
Images synthesised by our dual-domain model belong to one domain within the semantic mask, and to another in the rest of the image.
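The mask-conditioned output described above, one domain inside the semantic mask and another outside it, can be illustrated with a simple composition. A real dual-domain GAN generates both regions jointly; this toy numpy sketch only shows the mask arithmetic:

```python
import numpy as np

# Toy sketch of segmentation-guided dual-domain composition: where the
# semantic mask is 1 the output takes domain-A pixels, elsewhere domain-B.
# Constant images stand in for generated content from the two domains.

def compose(img_a: np.ndarray, img_b: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Blend two same-shaped images: mask==1 keeps img_a, mask==0 keeps img_b."""
    return mask * img_a + (1 - mask) * img_b

h, w = 4, 4
img_a = np.ones((h, w))                    # stand-in for domain-A content
img_b = np.zeros((h, w))                   # stand-in for domain-B content
mask = np.zeros((h, w)); mask[:, :2] = 1   # left half is the semantic region

out = compose(img_a, img_b, mask)
print(out)
```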
arXiv Detail & Related papers (2022-04-19T17:25:54Z)
- PixMatch: Unsupervised Domain Adaptation via Pixelwise Consistency Training [4.336877104987131]
Unsupervised domain adaptation is a promising technique for semantic segmentation.
We present a novel framework for unsupervised domain adaptation based on the notion of target-domain consistency training.
Our approach is simpler, easier to implement, and more memory-efficient during training.
arXiv Detail & Related papers (2021-05-17T19:36:28Z)
- Semantic Distribution-aware Contrastive Adaptation for Semantic Segmentation [50.621269117524925]
Domain adaptive semantic segmentation refers to making predictions on a certain target domain with only annotations of a specific source domain.
We present a semantic distribution-aware contrastive adaptation algorithm that enables pixel-wise representation alignment.
We evaluate SDCA on multiple benchmarks, achieving considerable improvements over existing algorithms.
arXiv Detail & Related papers (2021-05-11T13:21:25Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- Learning High-Resolution Domain-Specific Representations with a GAN Generator [5.8720142291102135]
We show that representations learnt by a GAN generator can be easily projected onto a semantic segmentation map using a lightweight decoder.
We propose LayerMatch scheme for approximating the representation of a GAN generator that can be used for unsupervised domain-specific pretraining.
We find that using a LayerMatch-pretrained backbone leads to superior accuracy compared to standard supervised pretraining on ImageNet.
arXiv Detail & Related papers (2020-06-18T11:57:18Z)
- Learning Texture Invariant Representation for Domain Adaptation of Semantic Segmentation [19.617821473205694]
It is challenging for a model trained with synthetic data to generalize to real data.
We diversify the texture of synthetic images using a style transfer algorithm.
We fine-tune the model with self-training to get direct supervision of the target texture.
arXiv Detail & Related papers (2020-03-02T13:11:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.