Attribute-Aware Deep Hashing with Self-Consistency for Large-Scale
Fine-Grained Image Retrieval
- URL: http://arxiv.org/abs/2311.12894v1
- Date: Tue, 21 Nov 2023 08:20:38 GMT
- Title: Attribute-Aware Deep Hashing with Self-Consistency for Large-Scale
Fine-Grained Image Retrieval
- Authors: Xiu-Shen Wei and Yang Shen and Xuhao Sun and Peng Wang and Yuxin Peng
- Abstract summary: We propose attribute-aware hashing networks with self-consistency for generating attribute-aware hash codes.
We develop an encoder-decoder network with a reconstruction task to distill high-level attribute-specific vectors in an unsupervised manner.
Our models are equipped with a feature decorrelation constraint on these attribute vectors to strengthen their representational ability.
- Score: 65.43522019468976
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Our work tackles large-scale fine-grained image retrieval: ranking
highest the images that depict the concept of interest (i.e., share the query's
sub-category label) based on the fine-grained details in the query. Such a
practical task must cope with two challenges: the fine-grained nature of the
data, with small inter-class variations but large intra-class variations, and
the explosive growth of fine-grained data. In this paper, we propose
attribute-aware hashing networks with self-consistency that generate
attribute-aware hash codes, which not only make the retrieval process efficient
but also establish explicit correspondences between hash codes and visual
attributes. Specifically, based on visual representations captured by
attention, we develop an encoder-decoder network trained with a reconstruction
task to distill high-level attribute-specific vectors from the
appearance-specific visual representations in an unsupervised manner, without
attribute annotations. Our models are also equipped with a feature
decorrelation constraint on these attribute vectors to strengthen their
representational ability. Then, driven by preserving the similarity of the
original entities, the required hash codes are generated from these
attribute-specific vectors and thus become attribute-aware. Furthermore, to
combat simplicity bias in deep hashing, we consider the model design from the
perspective of the self-consistency principle and further enhance the models'
self-consistency by adding an image reconstruction path. Comprehensive
quantitative experiments under diverse empirical settings on six fine-grained
retrieval datasets and two generic retrieval datasets show the superiority of
our models over competing methods.
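
The abstract describes a pipeline of attention-based features, an encoder-decoder that distills attribute-specific vectors, a decorrelation constraint on those vectors, similarity-preserving hash codes, and an auxiliary reconstruction path for self-consistency. The sketch below is a minimal PyTorch rendering of that pipeline for orientation only; module shapes, loss forms, and names such as AttributeAwareHashNet, decorrelation_loss, and similarity_preserving_loss are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the described attribute-aware hashing pipeline, assuming:
#   * a CNN backbone plus simple spatial-attention pooling for appearance features,
#   * an encoder that splits the pooled feature into M attribute-specific vectors,
#   * a decoder that reconstructs the pooled feature (reconstruction task),
#   * an off-diagonal decorrelation penalty between attribute vectors,
#   * a tanh-relaxed hashing head trained to preserve pairwise label similarity.
# All module names, dimensions, and loss weights are guesses; the paper also adds
# an image-level reconstruction path for self-consistency, omitted here for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttributeAwareHashNet(nn.Module):
    def __init__(self, feat_dim=512, num_attrs=8, attr_dim=64, code_len=48):
        super().__init__()
        self.backbone = nn.Sequential(                     # stand-in for a CNN backbone
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(7))
        self.attention = nn.Conv2d(feat_dim, 1, 1)         # spatial attention map
        self.encoder = nn.Linear(feat_dim, num_attrs * attr_dim)
        self.decoder = nn.Linear(num_attrs * attr_dim, feat_dim)
        self.hash_head = nn.Linear(attr_dim, code_len // num_attrs)
        self.num_attrs, self.attr_dim = num_attrs, attr_dim

    def forward(self, x):
        fmap = self.backbone(x)                            # B x D x 7 x 7
        attn = torch.softmax(self.attention(fmap).flatten(2), dim=-1)
        feat = (fmap.flatten(2) * attn).sum(-1)            # attention-pooled feature, B x D
        attrs = self.encoder(feat).view(-1, self.num_attrs, self.attr_dim)
        recon = self.decoder(attrs.flatten(1))             # reconstruct the pooled feature
        codes = torch.tanh(self.hash_head(attrs)).flatten(1)  # relaxed code, bits grouped per attribute
        return feat, attrs, recon, codes

def decorrelation_loss(attrs):
    # Penalize cosine similarity between different attribute vectors of a sample.
    a = F.normalize(attrs, dim=-1)                         # B x M x d
    sim = torch.bmm(a, a.transpose(1, 2))                  # B x M x M
    off_diag = sim - torch.diag_embed(torch.diagonal(sim, dim1=1, dim2=2))
    return off_diag.pow(2).mean()

def similarity_preserving_loss(codes, labels):
    # Make inner products of relaxed codes follow pairwise label similarity, so
    # Hamming distances after binarization reflect sub-category similarity.
    s = (labels[:, None] == labels[None, :]).float()       # 1 for same class, else 0
    inner = codes @ codes.t() / codes.shape[1]             # roughly in [-1, 1]
    return F.mse_loss(inner, 2 * s - 1)

model = AttributeAwareHashNet()
images, labels = torch.randn(4, 3, 224, 224), torch.tensor([0, 0, 1, 2])
feat, attrs, recon, codes = model(images)
loss = (similarity_preserving_loss(codes, labels)
        + F.mse_loss(recon, feat.detach())                 # reconstruction / self-consistency term
        + 0.1 * decorrelation_loss(attrs))
loss.backward()
binary_codes = torch.sign(codes)                           # binary codes for Hamming-distance retrieval
```

At retrieval time, database images would be ranked by the Hamming distance between their binary codes and the query's. In this sketch each contiguous group of bits is produced from one attribute-specific vector, which is one plausible way to realize the explicit hash-code-to-attribute correspondence the abstract mentions.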
Related papers
- Enhancing Fine-Grained Visual Recognition in the Low-Data Regime Through Feature Magnitude Regularization [23.78498670529746] (arXiv 2024-09-03)
We introduce a regularization technique to ensure that the magnitudes of the extracted features are evenly distributed.
Despite its apparent simplicity, our approach has demonstrated significant performance improvements across various fine-grained visual recognition datasets.
- Attributes Grouping and Mining Hashing for Fine-Grained Image Retrieval [24.8065557159198] (arXiv 2023-11-10)
We propose an Attributes Grouping and Mining Hashing (AGMH) scheme for fine-grained image retrieval.
AGMH groups and embeds the category-specific visual attributes in multiple descriptors to generate a comprehensive feature representation.
AGMH consistently yields the best performance against state-of-the-art methods on fine-grained benchmark datasets.
- Hierarchical Visual Primitive Experts for Compositional Zero-Shot Learning [52.506434446439776] (arXiv 2023-08-08)
Compositional zero-shot learning (CZSL) aims to recognize compositions with prior knowledge of known primitives (attributes and objects).
We propose a simple and scalable framework called Composition Transformer (CoT) to address these issues.
Our method achieves SoTA performance on several benchmarks, including MIT-States, C-GQA, and VAW-CZSL.
- Learning Structured Output Representations from Attributes using Deep Conditional Generative Models [0.0] (arXiv 2023-04-30)
This paper recreates the Conditional Variational Auto-encoder architecture and trains it on images conditioned on attributes.
We attempt to generate new faces with distinct attributes such as hair color and glasses, as well as different bird species samples.
- LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity [49.84167231111667] (arXiv 2022-04-06)
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations.
- Contextual Similarity Aggregation with Self-attention for Visual Re-ranking [96.55393026011811] (arXiv 2021-10-26)
We propose a visual re-ranking method by contextual similarity aggregation with self-attention.
We conduct comprehensive experiments on four benchmark datasets to demonstrate the generality and effectiveness of our proposed visual re-ranking method.
- Saliency-driven Class Impressions for Feature Visualization of Deep Neural Networks [55.11806035788036] (arXiv 2020-07-31)
It is advantageous to visualize the features considered to be essential for classification.
Existing visualization methods develop high confidence images consisting of both background and foreground features.
In this work, we propose a saliency-driven approach to visualize discriminative features that are considered most important for a given task.
- Attribute Mix: Semantic Data Augmentation for Fine Grained Recognition [102.45926816660665] (arXiv 2020-04-06)
We propose Attribute Mix, a data augmentation strategy at attribute level to expand the fine-grained samples.
The principle is that attribute features are shared among fine-grained sub-categories and can be seamlessly transferred between images.
This list is automatically generated from the titles and abstracts of the papers on this site.