Self-Supervised Bernoulli Autoencoders for Semi-Supervised Hashing
- URL: http://arxiv.org/abs/2007.08799v1
- Date: Fri, 17 Jul 2020 07:47:10 GMT
- Title: Self-Supervised Bernoulli Autoencoders for Semi-Supervised Hashing
- Authors: Ricardo Ñanculef, Francisco Mena, Antonio Macaluso, Stefano Lodi,
Claudio Sartori
- Abstract summary: This paper investigates the robustness of hashing methods based on variational autoencoders to the lack of supervision.
We propose a novel supervision method in which the model uses its label distribution predictions to implement the pairwise objective.
Our experiments show that both methods can significantly increase the hash codes' quality.
- Score: 1.8899300124593648
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic hashing is an emerging technique for large-scale similarity search
based on representing high-dimensional data with similarity-preserving binary
codes that enable efficient indexing and search. It has recently been shown that
variational autoencoders, with Bernoulli latent representations parametrized by
neural nets, can be successfully trained to learn such codes in supervised and
unsupervised scenarios, improving on more traditional methods thanks to their
ability to handle the binary constraints architecturally. However, the scenario
where labels are scarce has not been studied yet.
This paper investigates the robustness of hashing methods based on
variational autoencoders to the lack of supervision, focusing on two
semi-supervised approaches currently in use. The first augments the variational
autoencoder's training objective to jointly model the distribution over the
data and the class labels. The second approach exploits the annotations to
define an additional pairwise loss that enforces consistency between the
similarity in the code (Hamming) space and the similarity in the label space.
Our experiments show that both methods can significantly increase the hash
codes' quality. The pairwise approach can exhibit an advantage when the number
of labelled points is large. However, we found that this method degrades
quickly and loses its advantage as the number of labelled samples decreases. To circumvent
this problem, we propose a novel supervision method in which the model uses its
label distribution predictions to implement the pairwise objective. Compared to
the best baseline, this procedure yields similar performance in fully
supervised settings but improves the results significantly when labelled data
is scarce. Our code is made publicly available at
https://github.com/amacaluso/SSB-VAE.
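The abstract combines three ingredients: a Bernoulli-latent VAE encoder that produces binary codes, a pairwise loss tying Hamming similarity to label agreement, and a self-supervised variant that substitutes the model's own label-distribution predictions when annotations are missing. The following is a minimal PyTorch sketch of that combination, not the authors' implementation (see the linked repository for that); all layer sizes, names, and the straight-through binarization choice are illustrative assumptions.

```python
# Minimal sketch of the abstract's ideas; NOT the authors' code
# (see https://github.com/amacaluso/SSB-VAE for the real implementation).
# Layer sizes, names, and the binarization scheme are assumptions.
import torch
import torch.nn.functional as F

class BernoulliHashEncoder(torch.nn.Module):
    def __init__(self, input_dim, code_bits, num_classes):
        super().__init__()
        self.enc = torch.nn.Linear(input_dim, code_bits)    # q(b|x): per-bit probabilities
        self.cls = torch.nn.Linear(code_bits, num_classes)  # label predictor on the code

    def forward(self, x):
        probs = torch.sigmoid(self.enc(x))                  # Bernoulli parameters in (0, 1)
        # Straight-through binarization: hard bits forward, soft gradients backward.
        bits = (probs > 0.5).float() + probs - probs.detach()
        return probs, bits, F.softmax(self.cls(bits), dim=-1)

def pairwise_hash_loss(bits, label_probs, labels=None):
    """Pairwise term: same-class pairs should be close in Hamming distance,
    different-class pairs far apart. When `labels` is None, the self-supervised
    variant uses the model's predicted label distributions instead."""
    if labels is not None:
        same = (labels[:, None] == labels[None, :]).float()  # supervised pair targets
    else:
        same = label_probs @ label_probs.t()                 # predicted agreement (self-supervised)
    # Normalized Hamming similarity between all pairs of codes in the batch.
    n_bits = bits.shape[1]
    ham_sim = 1.0 - torch.cdist(bits, bits, p=1) / n_bits
    return F.mse_loss(ham_sim, same)
```

In a full semi-supervised objective, this pairwise term would be added, with a tunable weight, to the VAE's reconstruction and KL terms and to a classification loss on the labelled subset; the self-supervised branch lets the pairwise signal cover unlabelled points as well.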
Related papers
- Supervised Auto-Encoding Twin-Bottleneck Hashing [5.653113092257149]
Auto-encoding Twin-bottleneck Hashing is a method that dynamically builds the similarity graph during training.
In this work, we generalize the original model into a supervised deep hashing network by incorporating the label information.
arXiv Detail & Related papers (2023-06-19T18:50:02Z)
- Unsupervised Hashing with Similarity Distribution Calibration [127.34239817201549]
Unsupervised hashing methods aim to preserve the similarity between data points in a feature space by mapping them to binary hash codes.
These methods often overlook the fact that the similarity between data points in the continuous feature space may not be preserved in the discrete hash code space.
Because the range of Hamming similarities is bounded by the code length, the mapping can suffer from a problem known as similarity collapse (see the numeric sketch after this list).
This paper introduces a novel Similarity Distribution Calibration (SDC) method to alleviate this problem.
arXiv Detail & Related papers (2023-02-15T14:06:39Z)
- SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification [24.386165255835063]
A common situation in classification is having a large amount of training data, of which only a small portion carries class labels.
The goal of semi-supervised training, in this context, is to improve classification accuracy by leveraging information from the large amount of unlabeled data.
We propose a novel unsupervised objective that focuses on the less studied relationships among high-confidence unlabeled data points that are similar to each other.
Our proposed SimPLE algorithm shows significant performance gains over previous algorithms on CIFAR-100 and Mini-ImageNet, and is on par with state-of-the-art methods.
arXiv Detail & Related papers (2021-03-30T23:48:06Z)
- Rank-Consistency Deep Hashing for Scalable Multi-Label Image Search [90.30623718137244]
We propose a novel deep hashing method for scalable multi-label image search.
A new rank-consistency objective is applied to align the similarity orders from two spaces.
A loss function is designed to penalize samples whose semantic similarity and Hamming distance are mismatched.
arXiv Detail & Related papers (2021-02-02T13:46:58Z)
- CIMON: Towards High-quality Hash Codes [63.37321228830102]
We propose a new method named Comprehensive sImilarity Mining and cOnsistency learNing (CIMON).
First, we use global refinement and similarity statistical distribution to obtain reliable and smooth guidance. Second, both semantic and contrastive consistency learning are introduced to derive both disturb-invariant and discriminative hash codes.
arXiv Detail & Related papers (2020-10-15T14:47:14Z)
- Generative Semantic Hashing Enhanced via Boltzmann Machines [61.688380278649056]
Existing generative-hashing methods mostly assume a factorized form for the posterior distribution.
We propose to employ the distribution of a Boltzmann machine as the variational posterior.
We show that by effectively modeling correlations among different bits within a hash code, our model can achieve significant performance gains.
arXiv Detail & Related papers (2020-06-16T01:23:39Z)
- Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator [62.26981903551382]
Variational auto-encoders (VAEs) with binary latent variables provide state-of-the-art performance in terms of precision for document retrieval.
We propose a pairwise loss function with a discrete-latent VAE that rewards within-class similarity and between-class dissimilarity for supervised hashing.
This new semantic hashing framework achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-05-21T06:11:33Z)
- Auto-Encoding Twin-Bottleneck Hashing [141.5378966676885]
This paper proposes an efficient and adaptive code-driven graph.
It is updated by decoding in the context of an auto-encoder.
Experiments on benchmarked datasets clearly show the superiority of our framework over the state-of-the-art hashing methods.
arXiv Detail & Related papers (2020-02-27T05:58:12Z)
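As a concrete illustration of the bounded-similarity issue mentioned in the Similarity Distribution Calibration entry above, the following sketch shows how pairwise Hamming similarity over B-bit codes can only take B + 1 distinct levels, so many different feature-space similarities must collapse onto the same discrete value. The values are illustrative and not drawn from any of the listed papers.

```python
# Illustrative sketch (not from any of the listed papers): with B-bit codes,
# normalized Hamming similarity takes only B + 1 distinct values, so distinct
# continuous-space similarities inevitably collapse onto the same level.
import itertools

B = 4  # code length in bits (kept small so all codes can be enumerated)
codes = list(itertools.product([0, 1], repeat=B))

def hamming_similarity(a, b):
    """Fraction of bit positions on which two codes agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

levels = sorted({hamming_similarity(a, b) for a in codes for b in codes})
print(levels)            # [0.0, 0.25, 0.5, 0.75, 1.0] -> only B + 1 levels
print(len(codes) ** 2)   # 256 code pairs share those 5 similarity values
```

This quantization is the effect that calibrating the distribution of code similarities, as in the SDC paper above, aims to counteract.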