Fuse and Attend: Generalized Embedding Learning for Art and Sketches
- URL: http://arxiv.org/abs/2208.09698v1
- Date: Sat, 20 Aug 2022 14:44:11 GMT
- Title: Fuse and Attend: Generalized Embedding Learning for Art and Sketches
- Authors: Ujjal Kr Dutta
- Abstract summary: We propose a novel Embedding Learning approach with the goal of generalizing across different domains.
We show the prowess of our method using the DomainBed framework on the popular PACS (Photo, Art painting, Cartoon, and Sketch) dataset.
- Score: 6.375982344506753
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: While deep Embedding Learning approaches have witnessed widespread success in
multiple computer vision tasks, the state-of-the-art methods for representing
natural images do not necessarily perform well on images from other domains,
such as paintings, cartoons, and sketches. This is because of the large
distribution shift across these domains compared to natural images. Domains
like sketches often contain only sparse informative pixels. However,
recognizing objects in such domains is crucial, given multiple relevant
applications leveraging such data, for instance, sketch-to-image retrieval.
Thus, achieving an Embedding Learning model that performs well across
multiple domains is not only challenging, but also pivotal in computer
vision. To this end, in this paper, we propose a novel Embedding Learning
approach with the goal of generalizing across different domains. During
training, given a query image from a domain, we employ gated fusion and
attention to generate a positive example, which carries a broad notion of the
semantics of the query object category (from across multiple domains). By
virtue of Contrastive Learning, we pull together the embeddings of the query
and the positive, in order to learn a representation that is robust across
domains. At
the same time, to teach the model to be discriminative against examples from
different semantic categories (across domains), we also maintain a pool of
negative embeddings (from different categories). We show the prowess of our
method using the DomainBed framework, on the popular PACS (Photo, Art painting,
Cartoon, and Sketch) dataset.
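Taken together, the abstract describes three mechanisms: gated fusion and attention to synthesize a cross-domain positive for a query, a contrastive objective that pulls query and positive together, and a pool of negative embeddings from other semantic categories. The PyTorch sketch below is a minimal, hypothetical reconstruction of these ideas; the module names, tensor shapes, exact gating/attention forms, and the temperature value are assumptions for illustration, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedFusionPositive(nn.Module):
    # Builds one positive embedding for a query by gated fusion of
    # same-category embeddings drawn from multiple domains, followed by
    # learned attention over the fused candidates. (Illustrative design.)
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)  # gate over query/candidate pair
        self.attn = nn.Linear(dim, 1)        # attention score per candidate

    def forward(self, query, candidates):
        # query: (B, D); candidates: (B, K, D), K same-category examples
        # taken from several domains (Photo, Art, Cartoon, Sketch, ...).
        q = query.unsqueeze(1).expand_as(candidates)              # (B, K, D)
        gate = torch.sigmoid(self.gate(torch.cat([q, candidates], dim=-1)))
        fused = gate * candidates + (1.0 - gate) * q              # gated fusion
        weights = torch.softmax(self.attn(fused), dim=1)          # (B, K, 1)
        return (weights * fused).sum(dim=1)                       # (B, D)

def contrastive_loss(query, positive, negative_pool, temperature=0.07):
    # InfoNCE-style loss: pull the query toward its fused positive and
    # push it away from a pool of embeddings from other categories.
    query = F.normalize(query, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negative_pool, dim=-1)
    l_pos = (query * positive).sum(dim=-1, keepdim=True)          # (B, 1)
    l_neg = query @ negatives.t()                                 # (B, N)
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    # The positive sits at index 0 of each row of logits.
    labels = torch.zeros(query.size(0), dtype=torch.long, device=query.device)
    return F.cross_entropy(logits, labels)
```

One plausible way to maintain the negative pool is a MoCo-style FIFO queue, e.g. negative_pool = torch.cat([batch_embeds.detach(), negative_pool])[:pool_size], refreshed each step with embeddings from other categories; the paper should be consulted for the exact update rule.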
Related papers
- Leveraging Open-Vocabulary Diffusion to Camouflaged Instance
Segmentation [59.78520153338878]
Text-to-image diffusion techniques have shown exceptional capability of producing high-quality images from text descriptions.
We propose a method built upon a state-of-the-art diffusion model, empowered by open-vocabulary to learn multi-scale textual-visual features for camouflaged object representations.
arXiv Detail & Related papers (2023-12-29T07:59:07Z)
- Adapt and Align to Improve Zero-Shot Sketch-Based Image Retrieval [85.39613457282107]
The cross-domain nature of sketch-based image retrieval is challenging.
We present an effective "Adapt and Align" approach to address the key challenges.
Inspired by recent advances in image-text foundation models (e.g., CLIP) on zero-shot scenarios, we explicitly align the learned image embedding with a more semantic text embedding to achieve the desired knowledge transfer from seen to unseen classes.
arXiv Detail & Related papers (2023-05-09T03:10:15Z)
- Domain-invariant Prototypes for Semantic Segmentation [30.932130453313537]
We present an easy-to-train framework that learns domain-invariant prototypes for domain adaptive semantic segmentation.
Our method involves only one-stage training and does not need to be trained on large-scale un-annotated target images.
arXiv Detail & Related papers (2022-08-12T02:21:05Z)
- Unsupervised Domain Generalization by Learning a Bridge Across Domains [78.855606355957]
The Unsupervised Domain Generalization (UDG) setup has no training supervision in either the source or the target domain.
Our approach is based on self-supervised learning of a Bridge Across Domains (BrAD) - an auxiliary bridge domain accompanied by a set of semantics preserving visual (image-to-image) mappings to BrAD from each of the training domains.
We show how using an edge-regularized BrAD our approach achieves significant gains across multiple benchmarks and a range of tasks, including UDG, Few-shot UDA, and unsupervised generalization across multi-domain datasets.
arXiv Detail & Related papers (2021-12-04T10:25:45Z)
- Self-Supervised Learning of Domain Invariant Features for Depth Estimation [35.74969527929284]
We tackle the problem of unsupervised synthetic-to-realistic domain adaptation for single image depth estimation.
An essential building block of single image depth estimation is an encoder-decoder task network that takes RGB images as input and produces depth maps as output.
We propose a novel training strategy to force the task network to learn domain invariant representations in a self-supervised manner.
arXiv Detail & Related papers (2021-06-04T16:45:48Z)
- Extending and Analyzing Self-Supervised Learning Across Domains [50.13326427158233]
Self-supervised representation learning has achieved impressive results in recent years.
Experiments have primarily been conducted on ImageNet or other similarly large internet imagery datasets.
We experiment with several popular methods on an unprecedented variety of domains.
arXiv Detail & Related papers (2020-04-24T21:18:02Z)
- Unifying Specialist Image Embedding into Universal Image Embedding [84.0039266370785]
It is desirable to have a universal deep embedding model applicable to various domains of images.
We propose to distill the knowledge in multiple specialists into a universal embedding to solve this problem.
arXiv Detail & Related papers (2020-03-08T02:51:11Z)
- Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings [76.85673049332428]
Learned joint representations of images and text form the backbone of several important cross-domain tasks such as image captioning.
We propose a novel semi-supervised framework, which models shared information between domains and domain-specific information separately.
We demonstrate the effectiveness of our model on diverse tasks, including image captioning and text-to-image synthesis.
arXiv Detail & Related papers (2020-02-16T19:49:30Z)