Unsupervised Hashing with Similarity Distribution Calibration
- URL: http://arxiv.org/abs/2302.07669v2
- Date: Thu, 31 Aug 2023 11:24:15 GMT
- Title: Unsupervised Hashing with Similarity Distribution Calibration
- Authors: Kam Woh Ng, Xiatian Zhu, Jiun Tian Hoe, Chee Seng Chan, Tianyu Zhang,
Yi-Zhe Song, Tao Xiang
- Abstract summary: Unsupervised hashing methods aim to preserve the similarity between data points in a feature space by mapping them to binary hash codes.
These methods often overlook the fact that the similarity between data points in the continuous feature space may not be preserved in the discrete hash code space.
The similarity range is bounded by the code length and can lead to a problem known as similarity collapse.
This paper introduces a novel Similarity Distribution Calibration (SDC) method to alleviate this problem.
- Score: 127.34239817201549
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Unsupervised hashing methods typically aim to preserve the similarity between
data points in a feature space by mapping them to binary hash codes. However,
these methods often overlook the fact that the similarity between data points
in the continuous feature space may not be preserved in the discrete hash code
space, due to the limited similarity range of hash codes. The similarity range
is bounded by the code length and can lead to a problem known as similarity
collapse. That is, the positive and negative pairs of data points become less
distinguishable from each other in the hash space. To alleviate this problem,
this paper introduces a novel Similarity Distribution Calibration (SDC) method.
SDC aligns the hash code similarity distribution towards a
calibration distribution (e.g., beta distribution) with sufficient spread
across the entire similarity range, thus alleviating the similarity collapse
problem. Extensive experiments show that our SDC significantly outperforms the
state-of-the-art alternatives on coarse category-level and instance-level image
retrieval. Code is available at https://github.com/kamwoh/sdc.
Related papers
- Cluster-Aware Similarity Diffusion for Instance Retrieval [64.40171728912702]
Diffusion-based re-ranking is a common method used for retrieving instances by performing similarity propagation in a nearest neighbor graph.
We propose a novel Cluster-Aware Similarity (CAS) diffusion for instance retrieval.
arXiv Detail & Related papers (2024-06-04T14:19:50Z)
- Binary Representation via Jointly Personalized Sparse Hashing [22.296464665032588]
We propose an effective unsupervised method, namely Jointly Personalized Sparse Hashing (JPSH) for binary representation learning.
Different personalized subspaces are constructed to reflect category-specific attributes for different clusters.
To simultaneously preserve semantic and pairwise similarities in JPSH, we incorporate PSH and manifold-based hash learning into a seamless formulation.
arXiv Detail & Related papers (2022-08-31T14:18:37Z)
- Deep Unsupervised Hashing by Distilled Smooth Guidance [13.101031440853843]
We propose a novel deep unsupervised hashing method, namely Distilled Smooth Guidance (DSG).
To be specific, we obtain the similarity confidence weights based on the initial noisy similarity signals learned from local structures.
Extensive experiments on three widely used benchmark datasets show that the proposed DSG consistently outperforms the state-of-the-art search methods.
arXiv Detail & Related papers (2021-05-13T07:59:57Z)
- Rank-Consistency Deep Hashing for Scalable Multi-Label Image Search [90.30623718137244]
We propose a novel deep hashing method for scalable multi-label image search.
A new rank-consistency objective is applied to align the similarity orders from two spaces.
A powerful loss function is designed to penalize samples whose semantic similarity and Hamming distance are mismatched.
arXiv Detail & Related papers (2021-02-02T13:46:58Z)
- CIMON: Towards High-quality Hash Codes [63.37321228830102]
We propose a new method named Comprehensive sImilarity Mining and cOnsistency learNing (CIMON).
First, we use global refinement and similarity statistical distribution to obtain reliable and smooth guidance. Second, both semantic and contrastive consistency learning are introduced to derive both disturb-invariant and discriminative hash codes.
arXiv Detail & Related papers (2020-10-15T14:47:14Z)
- Self-Supervised Bernoulli Autoencoders for Semi-Supervised Hashing [1.8899300124593648]
This paper investigates the robustness of hashing methods based on variational autoencoders to the lack of supervision.
We propose a novel supervision method in which the model uses its label distribution predictions to implement the pairwise objective.
Our experiments show that both methods can significantly increase the hash codes' quality.
arXiv Detail & Related papers (2020-07-17T07:47:10Z)
- Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and
Self-Control Gradient Estimator [62.26981903551382]
Variational auto-encoders (VAEs) with binary latent variables provide state-of-the-art performance in terms of precision for document retrieval.
We propose a pairwise loss function with discrete latent VAE to reward within-class similarity and between-class dissimilarity for supervised hashing.
This new semantic hashing framework achieves superior performance compared to the state of the art.
arXiv Detail & Related papers (2020-05-21T06:11:33Z)
- Reinforcing Short-Length Hashing [61.75883795807109]
Existing methods perform poorly when retrieving with extremely short hash codes.
In this study, we propose a novel reinforcing short-length hashing (RSLH) method.
In RSLH, mutual reconstruction between the hash representation and semantic labels is performed to preserve the semantic information; a generic sketch of this idea follows the list.
Experiments on three large-scale image benchmarks demonstrate the superior performance of RSLH under various short-length hashing scenarios.
arXiv Detail & Related papers (2020-04-24T02:23:52Z)
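The mutual-reconstruction idea mentioned in the Reinforcing Short-Length Hashing entry can be sketched generically: learn one map from hash codes to labels and one from labels back to codes, and penalize both reconstruction errors so that even very short codes remain tied to the semantics. The class name, the linear parameterization, and the squared-error losses below are hypothetical illustrations of this general idea, not RSLH's actual objective.

```python
import torch
import torch.nn as nn

class MutualReconstruction(nn.Module):
    """Generic code<->label mutual reconstruction (illustrative only)."""
    def __init__(self, code_len: int, num_classes: int):
        super().__init__()
        self.code_to_label = nn.Linear(code_len, num_classes)  # B -> Y
        self.label_to_code = nn.Linear(num_classes, code_len)  # Y -> B

    def forward(self, codes, labels_onehot):
        # Penalize both directions so short codes retain label semantics.
        loss_y = ((self.code_to_label(codes) - labels_onehot) ** 2).mean()
        loss_b = ((self.label_to_code(labels_onehot) - codes) ** 2).mean()
        return loss_y + loss_b

# Usage sketch with 16-bit codes and 10 classes.
mr = MutualReconstruction(code_len=16, num_classes=10)
codes = torch.tanh(torch.randn(32, 16))
labels = torch.eye(10)[torch.randint(0, 10, (32,))]
print(mr(codes, labels))
```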
This list is automatically generated from the titles and abstracts of the papers on this site.