PHPQ: Pyramid Hybrid Pooling Quantization for Efficient Fine-Grained Image Retrieval
- URL: http://arxiv.org/abs/2109.05206v2
- Date: Tue, 9 Jan 2024 17:56:40 GMT
- Title: PHPQ: Pyramid Hybrid Pooling Quantization for Efficient Fine-Grained Image Retrieval
- Authors: Ziyun Zeng, Jinpeng Wang, Bin Chen, Tao Dai, Shu-Tao Xia, Zhi Wang
- Abstract summary: We propose a Pyramid Hybrid Pooling Quantization (PHPQ) module to capture and preserve fine-grained semantic information from multi-level features.
Experiments on two widely-used public benchmarks, CUB-200-2011 and Stanford Dogs, demonstrate that PHPQ outperforms state-of-the-art methods.
- Score: 68.05570413133462
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep hashing approaches, including deep quantization and deep binary hashing,
have become a common solution to large-scale image retrieval due to their high
computation and storage efficiency. Most existing hashing methods cannot
produce satisfactory results for fine-grained retrieval, because they usually
adopt the outputs of the last CNN layer to generate binary codes. Since deeper
layers tend to summarize visual clues, e.g., texture, into abstract semantics,
e.g., dogs and cats, the feature produced by the last CNN layer is less
effective in capturing subtle but discriminative visual details that mostly
exist in shallow layers. To improve fine-grained image hashing, we propose
Pyramid Hybrid Pooling Quantization (PHPQ). Specifically, we propose a Pyramid
Hybrid Pooling (PHP) module to capture and preserve fine-grained semantic
information from multi-level features, which emphasizes the subtle
discrimination of different sub-categories. In addition, we propose a learnable
quantization module with a partial codebook attention mechanism, which
optimizes the most relevant codewords and improves quantization quality.
Comprehensive experiments on two widely-used public benchmarks, i.e.,
CUB-200-2011 and Stanford Dogs, demonstrate that PHPQ outperforms
state-of-the-art methods.
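The two components described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the blending weight `alpha`, the softmax-over-distance attention, and the top-k "partial codebook" selection are all assumptions made here for illustration; the paper's actual pooling and attention formulas may differ.

```python
import numpy as np

def hybrid_pool(feat, alpha=0.5):
    # feat: (C, H, W). Hybrid pooling as an assumed blend of global
    # max-pooling (subtle local details) and average-pooling (context).
    return alpha * feat.max(axis=(1, 2)) + (1 - alpha) * feat.mean(axis=(1, 2))

def pyramid_hybrid_pool(feats, alpha=0.5):
    # feats: list of multi-level CNN feature maps, each (C_l, H_l, W_l).
    # Pooling every level and concatenating preserves shallow-layer detail
    # alongside deep-layer semantics in one descriptor.
    return np.concatenate([hybrid_pool(f, alpha) for f in feats])

def partial_codebook_quantize(x, codebook, k=4, tau=1.0):
    # x: (D,) descriptor; codebook: (K, D) learnable codewords.
    # "Partial codebook attention" is sketched here as soft attention
    # restricted to the k nearest codewords, so only the most relevant
    # codewords contribute to (and would receive gradients from) x.
    d = np.linalg.norm(codebook - x, axis=1)  # distance to every codeword
    idx = np.argsort(d)[:k]                   # partial codebook: top-k nearest
    w = np.exp(-d[idx] / tau)
    w /= w.sum()                              # soft attention weights
    return w @ codebook[idx]                  # soft-quantized reconstruction
```

For example, two feature levels of shapes (8, 4, 4) and (16, 2, 2) yield a 24-dimensional descriptor, which is then reconstructed from its k nearest codewords.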
Related papers
- Weakly Supervised Deep Hyperspherical Quantization for Image Retrieval [76.4407063566172]
We propose Weakly-Supervised Deep Hyperspherical Quantization (WSDHQ), which is the first work to learn deep quantization from weakly tagged images.
Specifically, we use word embeddings to represent the tags and enhance their semantic information based on a tag correlation graph.
We jointly learn semantics-preserving embeddings and a supervised quantizer on the hypersphere by employing a well-designed fusion layer and tailor-made loss functions.
arXiv Detail & Related papers (2024-04-07T15:48:33Z)
- Cascading Hierarchical Networks with Multi-task Balanced Loss for Fine-grained Hashing [1.6244541005112747]
Fine-grained hashing is more challenging than traditional hashing problems.
We propose a cascaded network to learn compact and highly semantic hash codes.
We also propose a novel approach to coordinately balance the loss of multi-task learning.
arXiv Detail & Related papers (2023-03-20T17:08:48Z)
- A Lower Bound of Hash Codes' Performance [122.88252443695492]
In this paper, we prove that inter-class distinctiveness and intra-class compactness among hash codes determine the lower bound of hash codes' performance.
We then propose a surrogate model to fully exploit the above objective by estimating the posterior of hash codes and controlling it, which results in a low-bias optimization.
By testing on a series of hash models, we obtain performance improvements on all of them, with up to a 26.5% increase in mean Average Precision and up to a 20.5% increase in accuracy.
arXiv Detail & Related papers (2022-10-12T03:30:56Z)
- TransHash: Transformer-based Hamming Hashing for Efficient Image Retrieval [0.0]
We present TransHash, a pure transformer-based framework for deep hashing learning.
We achieve 8.2%, 2.6%, and 12.7% performance gains in terms of average mAP for different hash bit lengths on three public datasets.
arXiv Detail & Related papers (2021-05-05T01:35:53Z)
- Deep Reinforcement Learning with Label Embedding Reward for Supervised Image Hashing [85.84690941656528]
We introduce a novel decision-making approach for deep supervised hashing.
We learn a deep Q-network with a novel label embedding reward defined by Bose-Chaudhuri-Hocquenghem codes.
Our approach outperforms state-of-the-art supervised hashing methods under various code lengths.
arXiv Detail & Related papers (2020-08-10T09:17:20Z)
- Deep Hashing with Hash-Consistent Large Margin Proxy Embeddings [65.36757931982469]
Image hash codes are produced by binarizing embeddings of convolutional neural networks (CNN) trained for either classification or retrieval.
The use of a fixed set of proxies (weights of the CNN classification layer) is proposed to eliminate this ambiguity.
The resulting hash-consistent large margin (HCLM) proxies are shown to encourage saturation of hashing units, thus guaranteeing a small binarization error.
arXiv Detail & Related papers (2020-07-27T23:47:43Z)
- A survey on deep hashing for image retrieval [7.156209824590489]
I propose a Shadow Recurrent Hashing (SRH) method in an attempt to break through the bottleneck of existing hashing methods.
Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function that encourages similar images to be projected close together.
Experiments on the CIFAR-10 dataset show the satisfactory performance of SRH.
arXiv Detail & Related papers (2020-06-10T03:01:59Z)
- Dual-level Semantic Transfer Deep Hashing for Efficient Social Image Retrieval [35.78137004253608]
Social networks store and disseminate a tremendous amount of user-shared images.
Deep hashing is an efficient indexing technique to support large-scale social image retrieval.
Existing methods suffer from severe semantic shortage when optimizing a large amount of deep neural network parameters.
We propose a Dual-level Semantic Transfer Deep Hashing (DSTDH) method to alleviate this problem.
arXiv Detail & Related papers (2020-06-10T01:03:09Z)
- Auto-Encoding Twin-Bottleneck Hashing [141.5378966676885]
This paper proposes an efficient and adaptive code-driven graph.
It is updated by decoding in the context of an auto-encoder.
Experiments on benchmarked datasets clearly show the superiority of our framework over the state-of-the-art hashing methods.
arXiv Detail & Related papers (2020-02-27T05:58:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.