A Lower Bound of Hash Codes' Performance
- URL: http://arxiv.org/abs/2210.05899v1
- Date: Wed, 12 Oct 2022 03:30:56 GMT
- Title: A Lower Bound of Hash Codes' Performance
- Authors: Xiaosu Zhu, Jingkuan Song, Yu Lei, Lianli Gao and Heng Tao Shen
- Abstract summary: In this paper, we prove that inter-class distinctiveness and intra-class compactness among hash codes determine the lower bound of hash codes' performance.
We then propose a surrogate model to fully exploit the above objective by estimating the posterior of hash codes and controlling it, which results in a low-bias optimization.
By testing on a series of hashing models, we obtain performance improvements across all of them, with up to a $26.5\%$ increase in mean Average Precision and up to a $20.5\%$ increase in accuracy.
- Score: 122.88252443695492
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a crucial approach for compact representation learning, hashing has
achieved great success in effectiveness and efficiency. Numerous heuristic
Hamming space metric learning objectives are designed to obtain high-quality
hash codes. Nevertheless, a theoretical analysis of criteria for learning good
hash codes remains largely unexplored. In this paper, we prove that
inter-class distinctiveness and intra-class compactness among hash codes
determine the lower bound of hash codes' performance. Promoting these two
characteristics could lift the bound and improve hash learning. We then propose
a surrogate model to fully exploit the above objective by estimating the
posterior of hash codes and controlling it, which results in a low-bias
optimization. Extensive experiments reveal the effectiveness of the proposed
method. By testing on a series of hashing models, we obtain performance
improvements across all of them, with up to a $26.5\%$ increase in mean Average
Precision and up to a $20.5\%$ increase in accuracy. Our code is publicly
available at \url{https://github.com/VL-Group/LBHash}.
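As a concrete reading of the bound's two ingredients, the sketch below (our illustration in NumPy, not the paper's surrogate model; the {0,1} code convention is an assumption) measures intra-class compactness and inter-class distinctiveness over a labeled batch of hash codes:

```python
# Minimal sketch (not the paper's implementation): measuring the two
# quantities the lower bound depends on for a batch of binary hash codes.
import numpy as np

def hamming_matrix(codes: np.ndarray) -> np.ndarray:
    """Pairwise Hamming distances between rows of a {0,1} code matrix."""
    # For binary codes, d(x, y) = sum(x) + sum(y) - 2 * <x, y>.
    inner = codes @ codes.T
    ones = codes.sum(axis=1, keepdims=True)
    return ones + ones.T - 2 * inner

def compactness_and_distinctiveness(codes, labels):
    """Mean intra-class and inter-class Hamming distances."""
    dist = hamming_matrix(codes)
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    intra = dist[same & off_diag].mean()   # smaller => more compact
    inter = dist[~same].mean()             # larger => more distinct
    return intra, inter

rng = np.random.default_rng(0)
codes = rng.integers(0, 2, size=(128, 64))   # 128 samples, 64-bit codes
labels = rng.integers(0, 10, size=128)       # 10 classes
# For random codes both values sit near 32 bits; training that shrinks the
# intra-class term while growing the inter-class term lifts the bound.
print(compactness_and_distinctiveness(codes, labels))
```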
Related papers
- SECRET: Towards Scalable and Efficient Code Retrieval via Segmented Deep Hashing [83.35231185111464]
Deep learning has shifted the retrieval paradigm from lexical matching to encoding source code and queries as vector representations.
Previous research proposes deep hashing-based methods, which generate hash codes for queries and code snippets and use Hamming distance for rapid recall of code candidates.
We propose a novel approach, which converts long hash codes calculated by existing deep hashing approaches into several short hash code segments through an iterative training strategy.
arXiv Detail & Related papers (2024-12-16T12:51:35Z)
- A Flexible Plug-and-Play Module for Generating Variable-Length Hash Codes [61.095479786194836]
Nested Hash Layer (NHL) is a plug-and-play module designed for existing deep supervised hashing models.
NHL simultaneously generates hash codes of varying lengths in a nested manner.
NHL achieves superior retrieval performance across various deep hashing models.
arXiv Detail & Related papers (2024-12-12T04:13:09Z)
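The nested generation described above can be pictured as one long code whose prefixes serve as the shorter codes. A toy sketch of that structure (the random projection stands in for a trained hash layer; `nested_codes` is a hypothetical helper, not NHL's API):

```python
# Illustrative sketch of nested variable-length hash codes: one long code
# is produced, and its prefixes act as the shorter codes. This mirrors the
# nested structure described above, not the NHL implementation.
import numpy as np

def nested_codes(features: np.ndarray, proj: np.ndarray, lengths=(16, 32, 64)):
    """Project features once, then read nested prefixes as shorter codes."""
    full = np.sign(features @ proj)        # full-length codes in {-1, +1}
    full[full == 0] = 1                    # break ties deterministically
    return {k: full[:, :k] for k in lengths}

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 256))       # 8 samples, 256-d features
proj = rng.normal(size=(256, 64))          # stand-in for a 64-bit hash layer
codes = nested_codes(features, proj)
for k, c in codes.items():
    print(k, c.shape)                      # (8, 16), (8, 32), (8, 64)
```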
- Self-Distilled Hashing for Deep Image Retrieval [25.645550298697938]
In hash-based image retrieval systems, a transformed version of an input usually generates a hash code different from the original's.
We propose a novel self-distilled hashing scheme to minimize the discrepancy while exploiting the potential of augmented data.
We also introduce hash proxy-based similarity learning and a binary cross-entropy-based quantization loss to produce high-quality hash codes.
arXiv Detail & Related papers (2021-12-16T12:01:50Z)
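A hedged sketch of the consistency idea in this entry: pull together the relaxed codes of two augmented views of the same image. The cosine-based term below is a generic stand-in, not the paper's exact loss:

```python
# Sketch of a self-distillation consistency term for hashing: the relaxed
# codes of two augmented views of the same image are pulled together.
import numpy as np

def relaxed_code(x, proj):
    return np.tanh(x @ proj)               # smooth surrogate for sign()

def consistency_loss(z1, z2):
    """1 - mean cosine similarity between paired relaxed codes."""
    z1n = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2n = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    return 1.0 - np.mean(np.sum(z1n * z2n, axis=1))

rng = np.random.default_rng(0)
proj = rng.normal(size=(128, 32))
view1 = rng.normal(size=(4, 128))                 # features of originals
view2 = view1 + 0.1 * rng.normal(size=(4, 128))   # features of augmentations
print(consistency_loss(relaxed_code(view1, proj), relaxed_code(view2, proj)))
```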
- Unsupervised Multi-Index Semantic Hashing [23.169142004594434]
We propose MISH, an unsupervised hashing model that learns hash codes that are both effective and highly efficient by being optimized for multi-index hashing.
We experimentally compare MISH to state-of-the-art semantic hashing baselines in the task of document similarity search.
We find that even though multi-index hashing also improves the efficiency of the baselines compared to a linear scan, they are still upwards of 33% slower than MISH.
arXiv Detail & Related papers (2021-03-26T13:33:48Z)
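For context, classical multi-index hashing (the recall mechanism MISH is optimized for) can be sketched in a few lines: split each b-bit code into m substrings, index each substring exactly, and rely on the pigeonhole principle, which guarantees that any code within Hamming radius r < m of a query matches it exactly in at least one substring:

```python
# Sketch of classical multi-index hashing: split each code into m
# substrings and index each in its own table; by pigeonhole, a code within
# Hamming radius r < m of the query matches it exactly in at least one
# substring, so those buckets contain all true candidates.
from collections import defaultdict

def split(code: int, bits: int, m: int):
    """Split a bits-wide integer code into m substring integers."""
    w = bits // m
    mask = (1 << w) - 1
    return [(code >> (i * w)) & mask for i in range(m)]

def build_index(codes, bits=32, m=4):
    tables = [defaultdict(list) for _ in range(m)]
    for idx, c in enumerate(codes):
        for i, s in enumerate(split(c, bits, m)):
            tables[i][s].append(idx)
    return tables

def query(q, codes, tables, bits=32, m=4, r=3):
    """Return ids within Hamming distance r of q (assumes r < m)."""
    cand = set()
    for i, s in enumerate(split(q, bits, m)):
        cand.update(tables[i].get(s, ()))
    return [i for i in cand if bin(codes[i] ^ q).count('1') <= r]

codes = [0b1011_0001_1100_0101_0110_1010_0011_1001, 0xDEADBEEF, 0x0]
tables = build_index(codes)
print(query(0xDEADBEEF ^ 0b101, codes, tables))   # [1]: distance 2 <= r
```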
- CIMON: Towards High-quality Hash Codes [63.37321228830102]
We propose a new method named Comprehensive sImilarity Mining and cOnsistency learNing (CIMON).
First, we use global refinement and similarity statistical distribution to obtain reliable and smooth guidance. Second, both semantic and contrastive consistency learning are introduced to derive both disturb-invariant and discriminative hash codes.
arXiv Detail & Related papers (2020-10-15T14:47:14Z)
- Deep Hashing with Hash-Consistent Large Margin Proxy Embeddings [65.36757931982469]
Image hash codes are produced by binarizing the embeddings of convolutional neural networks (CNNs) trained for either classification or retrieval. Because such embeddings suffer from a rotational ambiguity, binarization is non-trivial; the use of a fixed set of proxies (the weights of the CNN classification layer) is proposed to eliminate this ambiguity.
The resulting hash-consistent large margin (HCLM) proxies are shown to encourage saturation of hashing units, thus guaranteeing a small binarization error.
arXiv Detail & Related papers (2020-07-27T23:47:43Z)
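A rough sketch of the fixed-proxy mechanism (proxy construction here uses random sign vectors purely for illustration; the paper designs hash-consistent large-margin proxies):

```python
# Sketch of classification against fixed binary proxies: each class gets a
# fixed {-1,+1} proxy, embeddings are scored against all proxies, and
# pushing an embedding toward its class proxy saturates the hashing units,
# keeping the sign() binarization error small.
import numpy as np

rng = np.random.default_rng(0)
num_classes, bits = 10, 64
proxies = rng.choice([-1.0, 1.0], size=(num_classes, bits))  # fixed, untrained

def logits(embeddings):
    """Score embeddings against the fixed binary proxies."""
    return embeddings @ proxies.T

def binarization_error(embeddings):
    """Mean distance between embeddings and their binarized codes."""
    return np.mean(np.abs(embeddings - np.sign(embeddings)))

# An embedding that has converged near its class proxy binarizes cleanly:
emb = proxies[3] + 0.05 * rng.normal(size=bits)
print(logits(emb[None]).argmax(), binarization_error(emb[None]))  # 3, ~0.04
```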
- Procrustean Orthogonal Sparse Hashing [3.302605292858623]
We show that insect olfaction is structurally and functionally analogous to sparse hashing.
We present a novel method, Procrustean Orthogonal Sparse Hashing (POSH), that unifies these findings.
We additionally propose two new methods, Binary OSL and SphericalHash, to address deficiencies of prior approaches.
arXiv Detail & Related papers (2020-06-08T18:09:33Z)
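The olfaction analogy suggests a rotation followed by sparse winner-take-all coding. The sketch below uses a random orthogonal matrix where POSH learns one via a Procrustes-style objective, so it illustrates the code structure only:

```python
# Generic sketch of orthogonal sparse hashing: features are rotated by an
# orthogonal matrix (random here, learned in POSH) and the top-k
# coordinates are set to 1, giving a sparse binary code analogous to
# winner-take-all responses in insect olfaction.
import numpy as np

def orthogonal_sparse_hash(x, Q, k=8):
    """Rotate x by orthogonal Q, keep the top-k coordinates as 1s."""
    proj = x @ Q
    code = np.zeros_like(proj)
    top = np.argpartition(proj, -k, axis=1)[:, -k:]
    np.put_along_axis(code, top, 1.0, axis=1)
    return code

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(64, 64)))   # random orthogonal matrix
x = rng.normal(size=(5, 64))
codes = orthogonal_sparse_hash(x, Q)
print(codes.sum(axis=1))                          # exactly k ones per code
```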
- Reinforcing Short-Length Hashing [61.75883795807109]
Existing methods perform poorly when retrieving with extremely short hash codes.
In this study, we propose a novel reinforcing short-length hashing (RSLH) method, in which mutual reconstruction between the hash representation and semantic labels preserves semantic information.
Experiments on three large-scale image benchmarks demonstrate the superior performance of RSLH under various short-length hashing scenarios.
arXiv Detail & Related papers (2020-04-24T02:23:52Z)
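The mutual-reconstruction term can be written as two paired regressions between codes and labels. A schematic with closed-form least squares (the variable names and plain least-squares solver are our simplifications, not RSLH's optimization):

```python
# Schematic of mutual reconstruction between hash codes and labels: labels
# should be recoverable from codes and codes from labels, which preserves
# semantics even in very short codes.
import numpy as np

rng = np.random.default_rng(0)
n, bits, c = 100, 8, 10                      # short 8-bit codes, 10 classes
H = rng.choice([-1.0, 1.0], size=(n, bits))  # hash representations
Y = np.eye(c)[rng.integers(0, c, size=n)]    # one-hot semantic labels

# Closed-form least-squares solutions for the two reconstruction maps.
W_hy, *_ = np.linalg.lstsq(H, Y, rcond=None)   # codes  -> labels
W_yh, *_ = np.linalg.lstsq(Y, H, rcond=None)   # labels -> codes

loss = (np.linalg.norm(Y - H @ W_hy) ** 2 +
        np.linalg.norm(H - Y @ W_yh) ** 2)
print(round(loss, 2))
```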
- Image Hashing by Minimizing Discrete Component-wise Wasserstein Distance [12.968141477410597]
Adversarial autoencoders are shown to implicitly learn a robust, locality-preserving hash function that generates balanced, high-quality hash codes.
Existing adversarial hashing methods, however, are too inefficient for large-scale image retrieval applications.
We propose a new adversarial-autoencoder hashing approach that has a much lower sample requirement and computational cost.
arXiv Detail & Related papers (2020-02-29T00:22:53Z)
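In one dimension, the Wasserstein-1 distance between two equal-sized samples reduces to a mean absolute difference of sorted values, which makes a component-wise version cheap to compute. A sketch of that quantity (the balanced {-1, +1} target is an illustrative choice, not necessarily the paper's):

```python
# Sketch of a discrete component-wise Wasserstein distance: for each code
# component, the 1-D W1 distance between the batch's values and a
# reference sample reduces to a mean absolute difference of sorted values.
import numpy as np

def componentwise_wasserstein(batch: np.ndarray, reference: np.ndarray):
    """Sum over components of the 1-D W1 distance (sorted-sample form)."""
    b = np.sort(batch, axis=0)              # sort each component's samples
    r = np.sort(reference, axis=0)
    return np.abs(b - r).mean(axis=0).sum()

rng = np.random.default_rng(0)
relaxed = np.tanh(rng.normal(size=(256, 32)))        # relaxed hash codes
target = rng.choice([-1.0, 1.0], size=(256, 32))     # balanced binary target
print(componentwise_wasserstein(relaxed, target))
```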