Self-Supervised Bernoulli Autoencoders for Semi-Supervised Hashing
- URL: http://arxiv.org/abs/2007.08799v1
- Date: Fri, 17 Jul 2020 07:47:10 GMT
- Title: Self-Supervised Bernoulli Autoencoders for Semi-Supervised Hashing
- Authors: Ricardo Ñanculef, Francisco Mena, Antonio Macaluso, Stefano Lodi,
Claudio Sartori
- Abstract summary: This paper investigates the robustness of hashing methods based on variational autoencoders to the lack of supervision.
We propose a novel supervision method in which the model uses its label distribution predictions to implement the pairwise objective.
Our experiments show that both methods can significantly increase the hash codes' quality.
- Score: 1.8899300124593648
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic hashing is an emerging technique for large-scale similarity search
based on representing high-dimensional data with similarity-preserving binary
codes that enable efficient indexing and search. It has recently been shown that
variational autoencoders, with Bernoulli latent representations parametrized by
neural nets, can be successfully trained to learn such codes in supervised and
unsupervised scenarios, improving on more traditional methods thanks to their
ability to handle the binary constraints architecturally. However, the scenario
where labels are scarce has not been studied yet.
This paper investigates the robustness of hashing methods based on
variational autoencoders to the lack of supervision, focusing on two
semi-supervised approaches currently in use. The first augments the variational
autoencoder's training objective to jointly model the distribution over the
data and the class labels. The second approach exploits the annotations to
define an additional pairwise loss that enforces consistency between the
similarity in the code (Hamming) space and the similarity in the label space.
Our experiments show that both methods can significantly increase the hash
codes' quality. The pairwise approach can exhibit an advantage when the number
of labelled points is large. However, we found that this method degrades
quickly and loses its advantage as the number of labelled samples decreases. To circumvent
this problem, we propose a novel supervision method in which the model uses its
label distribution predictions to implement the pairwise objective. Compared to
the best baseline, this procedure yields similar performance in fully
supervised settings but improves the results significantly when labelled data
is scarce. Our code is made publicly available at
https://github.com/amacaluso/SSB-VAE.
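The abstract combines three ingredients: a Bernoulli-latent VAE encoder that produces binary codes, a pairwise loss tying Hamming similarity to label agreement, and a self-supervised variant that substitutes the model's own label-distribution predictions when annotations are missing. The following is a minimal PyTorch sketch of that combination, not the authors' implementation (see the linked repository for that); all layer sizes, names, and the straight-through binarization choice are illustrative assumptions.

```python
# Minimal sketch of the abstract's ideas; NOT the authors' code
# (see https://github.com/amacaluso/SSB-VAE for the real implementation).
# Layer sizes, names, and the binarization scheme are assumptions.
import torch
import torch.nn.functional as F

class BernoulliHashEncoder(torch.nn.Module):
    def __init__(self, input_dim, code_bits, num_classes):
        super().__init__()
        self.enc = torch.nn.Linear(input_dim, code_bits)    # q(b|x): per-bit probabilities
        self.cls = torch.nn.Linear(code_bits, num_classes)  # label predictor on the code

    def forward(self, x):
        probs = torch.sigmoid(self.enc(x))                  # Bernoulli parameters in (0, 1)
        # Straight-through binarization: hard bits forward, soft gradients backward.
        bits = (probs > 0.5).float() + probs - probs.detach()
        return probs, bits, F.softmax(self.cls(bits), dim=-1)

def pairwise_hash_loss(bits, label_probs, labels=None):
    """Pairwise term: same-class pairs should be close in Hamming distance,
    different-class pairs far apart. When `labels` is None, the self-supervised
    variant uses the model's predicted label distributions instead."""
    if labels is not None:
        same = (labels[:, None] == labels[None, :]).float()  # supervised pair targets
    else:
        same = label_probs @ label_probs.t()                 # predicted agreement (self-supervised)
    # Normalized Hamming similarity between all pairs of codes in the batch.
    n_bits = bits.shape[1]
    ham_sim = 1.0 - torch.cdist(bits, bits, p=1) / n_bits
    return F.mse_loss(ham_sim, same)
```

In a full semi-supervised objective, this pairwise term would be added, with a tunable weight, to the VAE's reconstruction and KL terms and to a classification loss on the labelled subset; the self-supervised branch lets the pairwise signal cover unlabelled points as well.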
Related papers
- Supervised Auto-Encoding Twin-Bottleneck Hashing [5.653113092257149]
Auto-encoding Twin-bottleneck Hashing is a method that dynamically builds the similarity graph during training.
In this work, we generalize the original model into a supervised deep hashing network by incorporating the label information.
arXiv Detail & Related papers (2023-06-19T18:50:02Z)
- Unsupervised Hashing with Similarity Distribution Calibration [127.34239817201549]
Unsupervised hashing methods aim to preserve the similarity between data points in a feature space by mapping them to binary hash codes.
These methods often overlook the fact that the similarity between data points in the continuous feature space may not be preserved in the discrete hash code space.
Because the range of Hamming similarities is bounded by the code length, the mapping can suffer from a problem known as similarity collapse (see the numeric sketch after this list).
This paper introduces a novel Similarity Distribution Calibration (SDC) method to alleviate this problem.
arXiv Detail & Related papers (2023-02-15T14:06:39Z)
- SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification [24.386165255835063]
A common situation in classification is having a large amount of training data, of which only a small portion carries class labels.
The goal of semi-supervised training, in this context, is to improve classification accuracy by leveraging information from the large amount of unlabeled data.
We propose a novel unsupervised objective that focuses on the less studied relationships among high-confidence unlabeled data points that are similar to each other.
Our proposed SimPLE algorithm shows significant performance gains over previous algorithms on CIFAR-100 and Mini-ImageNet, and is on par with state-of-the-art methods.
arXiv Detail & Related papers (2021-03-30T23:48:06Z)
- Rank-Consistency Deep Hashing for Scalable Multi-Label Image Search [90.30623718137244]
We propose a novel deep hashing method for scalable multi-label image search.
A new rank-consistency objective is applied to align the similarity orders from two spaces.
A loss function is designed to penalize samples whose semantic similarity and Hamming distance are mismatched.
arXiv Detail & Related papers (2021-02-02T13:46:58Z)
- CIMON: Towards High-quality Hash Codes [63.37321228830102]
We propose a new method named Comprehensive sImilarity Mining and cOnsistency learNing (CIMON).
First, we use global refinement and similarity statistical distribution to obtain reliable and smooth guidance. Second, both semantic and contrastive consistency learning are introduced to derive both disturb-invariant and discriminative hash codes.
arXiv Detail & Related papers (2020-10-15T14:47:14Z)
- Generative Semantic Hashing Enhanced via Boltzmann Machines [61.688380278649056]
Existing generative-hashing methods mostly assume a factorized form for the posterior distribution.
We propose to employ the distribution of a Boltzmann machine as the variational posterior.
We show that by effectively modeling correlations among different bits within a hash code, our model can achieve significant performance gains.
arXiv Detail & Related papers (2020-06-16T01:23:39Z)
- Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator [62.26981903551382]
Variational auto-encoders (VAEs) with binary latent variables provide state-of-the-art performance in terms of precision for document retrieval.
We propose a pairwise loss function with a discrete-latent VAE that rewards within-class similarity and between-class dissimilarity for supervised hashing.
This new semantic hashing framework achieves superior performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-05-21T06:11:33Z)
- Auto-Encoding Twin-Bottleneck Hashing [141.5378966676885]
This paper proposes an efficient and adaptive code-driven graph.
It is updated by decoding in the context of an auto-encoder.
Experiments on benchmarked datasets clearly show the superiority of our framework over the state-of-the-art hashing methods.
arXiv Detail & Related papers (2020-02-27T05:58:12Z)
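As a concrete illustration of the bounded-similarity issue mentioned in the Similarity Distribution Calibration entry above, the following sketch shows how pairwise Hamming similarity over B-bit codes can only take B + 1 distinct levels, so many different feature-space similarities must collapse onto the same discrete value. The values are illustrative and not drawn from any of the listed papers.

```python
# Illustrative sketch (not from any of the listed papers): with B-bit codes,
# normalized Hamming similarity takes only B + 1 distinct values, so distinct
# continuous-space similarities inevitably collapse onto the same level.
import itertools

B = 4  # code length in bits (kept small so all codes can be enumerated)
codes = list(itertools.product([0, 1], repeat=B))

def hamming_similarity(a, b):
    """Fraction of bit positions on which two codes agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

levels = sorted({hamming_similarity(a, b) for a in codes for b in codes})
print(levels)            # [0.0, 0.25, 0.5, 0.75, 1.0] -> only B + 1 levels
print(len(codes) ** 2)   # 256 code pairs share those 5 similarity values
```

This quantization is the effect that calibrating the distribution of code similarities, as in the SDC paper above, aims to counteract.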