Effectively Leveraging Attributes for Visual Similarity
- URL: http://arxiv.org/abs/2105.01695v1
- Date: Tue, 4 May 2021 18:28:35 GMT
- Title: Effectively Leveraging Attributes for Visual Similarity
- Authors: Samarth Mishra, Zhongping Zhang, Yuan Shen, Ranjitha Kumar, Venkatesh Saligrama, Bryan Plummer
- Abstract summary: We propose the Pairwise Attribute-informed similarity Network (PAN), which breaks similarity learning into capturing similarity conditions and relevance scores from a joint representation of two images.
PAN obtains a 4-9% improvement on compatibility prediction between clothing items on Polyvore Outfits, a 5% gain on few-shot classification of images using Caltech-UCSD Birds (CUB), and over 1% boost to Recall@1 on In-Shop Clothes Retrieval.
- Score: 52.2646549020835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Measuring similarity between two images often requires performing complex
reasoning along different axes (e.g., color, texture, or shape). Insights into
what might be important for measuring similarity can be provided by
annotated attributes, but prior work tends to treat these annotations as
complete, which leads to a simplistic approach: predict attributes on
single images, then use those predictions to measure similarity.
However, it is impractical for a dataset to fully annotate every attribute that
may be important. Thus, only representing images based on these incomplete
annotations may miss out on key information. To address this issue, we propose
the Pairwise Attribute-informed similarity Network (PAN), which breaks
similarity learning into capturing similarity conditions and relevance scores
from a joint representation of two images. This lets our model recognize
that two images share an attribute yet deem that attribute irrelevant
(e.g., due to fine-grained differences between them) and ignore it when measuring
similarity between the two images. Notably, while prior methods of using
attribute annotations are often unable to outperform prior art, PAN obtains a
4-9% improvement on compatibility prediction between clothing items on Polyvore
Outfits, a 5% gain on few-shot classification of images using Caltech-UCSD
Birds (CUB), and over 1% boost to Recall@1 on In-Shop Clothes Retrieval.
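The abstract describes splitting similarity into per-condition similarity scores and relevance scores, both computed from a joint representation of the two images. The sketch below is one possible reading of that idea, not the authors' architecture: the joint feature (element-wise product plus absolute difference), the single linear layers, the sigmoid and softmax choices, and all names (`joint_representation`, `pairwise_similarity`, `w_cond`, `w_rel`) are assumptions made for illustration, and the weights are toy values rather than learned parameters.

```python
import math


def joint_representation(feat_a, feat_b):
    """Assumed joint feature: element-wise product and absolute difference, concatenated."""
    prod = [a * b for a, b in zip(feat_a, feat_b)]
    diff = [abs(a - b) for a, b in zip(feat_a, feat_b)]
    return prod + diff


def linear(weights, vec):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_similarity(feat_a, feat_b, w_cond, w_rel):
    """Return (score, condition_scores, relevance_scores) for one image pair.

    Each condition score says how similar the pair is along one hypothetical
    axis; each relevance score says how much that axis should count for this
    particular pair. The final score is the relevance-weighted sum.
    """
    joint = joint_representation(feat_a, feat_b)
    conditions = [sigmoid(z) for z in linear(w_cond, joint)]
    relevance = softmax(linear(w_rel, joint))
    score = sum(c * r for c, r in zip(conditions, relevance))
    return score, conditions, relevance


# Toy usage: 2-d image features, 3 hypothetical similarity conditions.
feat_a = [1.0, 0.0]
feat_b = [0.5, 1.0]
w_cond = [[0.2, -0.1, 0.4, 0.3], [1.0, 0.0, -0.5, 0.1], [-0.3, 0.6, 0.2, -0.2]]
w_rel = [[0.5, 0.5, -1.0, 0.0], [0.0, 1.0, 0.3, -0.4], [-0.2, 0.1, 0.0, 0.9]]
score, conditions, relevance = pairwise_similarity(feat_a, feat_b, w_cond, w_rel)
print(score, conditions, relevance)
```

Because the relevance scores depend on the pair rather than on a single image, a condition can score high (the attribute is shared) and still receive a low weight for that pair, which is the behavior the abstract highlights.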
Related papers
- PRISM: PRogressive dependency maxImization for Scale-invariant image Matching [4.9521269535586185]
We propose PRogressive dependency maxImization for Scale-invariant image Matching (PRISM).
Our method's superior matching performance and generalization capability are confirmed by leading accuracy across various evaluation benchmarks and downstream tasks.
arXiv Detail & Related papers (2024-08-07T07:35:17Z)
- Interpretable Measures of Conceptual Similarity by Complexity-Constrained Descriptive Auto-Encoding [112.0878081944858]
Quantifying the degree of similarity between images is a key copyright issue for image-based machine learning.
We seek to define and compute a notion of "conceptual similarity" among images that captures high-level relations.
Two highly dissimilar images can be discriminated early in their description, whereas conceptually dissimilar ones will need more detail to be distinguished.
arXiv Detail & Related papers (2024-02-14T03:31:17Z)
- Attribute-Aware Deep Hashing with Self-Consistency for Large-Scale Fine-Grained Image Retrieval [65.43522019468976]
We propose attribute-aware hashing networks with self-consistency for generating attribute-aware hash codes.
We develop an encoder-decoder structure network of a reconstruction task to unsupervisedly distill high-level attribute-specific vectors.
Our models are equipped with a feature decorrelation constraint upon these attribute vectors to strengthen their representative abilities.
arXiv Detail & Related papers (2023-11-21T08:20:38Z)
- Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning [35.53729744330751]
Contrastive learning methods train visual encoders by comparing views from one instance to others.
This binary instance discrimination is studied extensively to improve feature representations in self-supervised learning.
In this paper, we rethink the instance discrimination framework and find the binary instance labeling insufficient to measure correlations between different samples.
arXiv Detail & Related papers (2023-03-30T04:22:07Z)
- Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval [27.751399400911932]
We introduce an attribute-guided multi-level attention network (AG-MAN) for fine-grained fashion retrieval.
Specifically, we first enhance the pre-trained feature extractor to capture multi-level image embedding.
Then, we propose a classification scheme where images with the same attribute, albeit with different values, are categorized into the same class.
arXiv Detail & Related papers (2022-12-27T05:28:38Z)
- FashionSearchNet-v2: Learning Attribute Representations with Localization for Image Retrieval with Attribute Manipulation [22.691709684780292]
The proposed FashionSearchNet-v2 architecture is able to learn attribute specific representations by leveraging on its weakly-supervised localization module.
The network is jointly trained with the combination of attribute classification and triplet ranking loss to estimate local representations.
Experiments performed on several datasets that are rich in terms of the number of attributes show that FashionSearchNet-v2 outperforms the other state-of-the-art attribute manipulation techniques.
arXiv Detail & Related papers (2021-11-28T13:50:20Z)
- Fine-Grained Fashion Similarity Prediction by Attribute-Specific Embedding Learning [71.74073012364326]
We propose an Attribute-Specific Embedding Network (ASEN) to jointly learn multiple attribute-specific embeddings.
The proposed ASEN is comprised of a global branch and a local branch.
Experiments on three fashion-related datasets, i.e., FashionAI, DARN, and DeepFashion, show the effectiveness of ASEN for fine-grained fashion similarity prediction.
arXiv Detail & Related papers (2021-04-06T11:26:38Z)
- Learning to Infer Unseen Attribute-Object Compositions [55.58107964602103]
A graph-based model is proposed that can flexibly recognize both single- and multi-attribute-object compositions.
We build a large-scale Multi-Attribute dataset with 116,099 images and 8,030 composition categories.
arXiv Detail & Related papers (2020-10-27T14:57:35Z)
- Attribute Mix: Semantic Data Augmentation for Fine Grained Recognition [102.45926816660665]
We propose Attribute Mix, a data augmentation strategy at attribute level to expand the fine-grained samples.
The principle lies in that attribute features are shared among fine-grained sub-categories, and can be seamlessly transferred among images.
arXiv Detail & Related papers (2020-04-06T14:06:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.