Bio-Inspired Hashing for Unsupervised Similarity Search
- URL: http://arxiv.org/abs/2001.04907v2
- Date: Tue, 30 Jun 2020 17:29:56 GMT
- Title: Bio-Inspired Hashing for Unsupervised Similarity Search
- Authors: Chaitanya K. Ryali, John J. Hopfield, Leopold Grinberg, Dmitry Krotov
- Abstract summary: We propose a novel hashing algorithm, BioHash, that produces sparse high-dimensional hash codes in a data-driven manner.
Our work provides evidence for the proposal that LSH might be a computational reason for the abundance of sparse expansive motifs in a variety of biological systems.
- Score: 12.78093617645299
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The fruit fly Drosophila's olfactory circuit has inspired a new locality sensitive hashing (LSH) algorithm, FlyHash. In contrast with classical LSH algorithms that produce low-dimensional hash codes, FlyHash produces sparse high-dimensional hash codes and has also been shown to have superior empirical performance compared to classical LSH algorithms in similarity search. However, FlyHash uses random projections and cannot learn from data. Building on inspiration from FlyHash and the ubiquity of sparse expansive representations in neurobiology, our work proposes a novel hashing algorithm, BioHash, that produces sparse high-dimensional hash codes in a data-driven manner. We show that BioHash outperforms previously published benchmarks for various hashing methods. Since our learning algorithm is based on a local and biologically plausible synaptic plasticity rule, our work provides evidence for the proposal that LSH might be a computational reason for the abundance of sparse expansive motifs in a variety of biological systems. We also propose a convolutional variant, BioConvHash, that further improves performance. From the perspective of computer science, BioHash and BioConvHash are fast, scalable, and yield compressed binary representations that are useful for similarity search.
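The mechanism described above is compact enough to sketch: a sparse random expansion into a much higher-dimensional space, followed by winner-take-all sparsification that keeps only the strongest activations. The snippet below is a minimal illustration of this FlyHash-style pipeline, i.e., the step that BioHash makes data-driven by learning the projection with a local plasticity rule. The expansion size `m`, sparsity `k`, and 10% connection probability are illustrative assumptions, not values from the paper.

```python
import numpy as np

def fly_hash(X, m=2000, k=32, seed=0):
    """FlyHash-style sketch: sparse random expansion + winner-take-all.

    X : (n, d) array of dense input vectors.
    m : expansive hidden dimension, m >> d (illustrative value).
    k : number of active bits kept per hash code (illustrative value).
    """
    rng = np.random.default_rng(seed)
    # Sparse binary random projection: each of the m expansion units
    # samples a small random subset (~10%) of the d input dimensions.
    W = (rng.random((X.shape[1], m)) < 0.1).astype(float)
    A = X @ W                                   # (n, m) activations
    # Winner-take-all sparsification: keep the top-k activations as 1s.
    H = np.zeros(A.shape, dtype=np.uint8)
    topk = np.argpartition(A, -k, axis=1)[:, -k:]
    np.put_along_axis(H, topk, 1, axis=1)
    return H                                    # sparse high-dimensional codes

X = np.random.default_rng(1).random((5, 64))
H = fly_hash(X)
print(H.sum(axis=1))                            # every code has exactly k ones
```

Because similar inputs tend to share winners, the overlap between codes tracks input similarity, which is the LSH property the abstract appeals to.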
Related papers
- SECRET: Towards Scalable and Efficient Code Retrieval via Segmented Deep Hashing [83.35231185111464]
Deep learning has shifted the code retrieval paradigm from lexical matching to encoding source code and queries as vector representations.
Previous research proposes deep hashing-based methods, which generate hash codes for queries and code snippets and use Hamming distance for rapid recall of code candidates.
We propose a novel approach that converts the long hash codes produced by existing deep hashing approaches into several short hash code segments through an iterative training strategy.
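The Hamming-distance recall these pipelines rely on is fast because it reduces to XOR plus popcount over packed bit codes. A toy sketch follows; the sizes and names are assumptions for the demo, not SECRET's implementation.

```python
import numpy as np

def hamming_recall(query_code, db_codes, top_n=5):
    """Toy Hamming-distance recall over 64-bit packed hash codes."""
    diffs = np.bitwise_xor(db_codes, query_code)        # differing bits
    bits = np.unpackbits(diffs.view(np.uint8).reshape(len(db_codes), -1), axis=1)
    dists = bits.sum(axis=1)                            # popcount per candidate
    return np.argsort(dists)[:top_n]                    # nearest code indices

rng = np.random.default_rng(0)
db = rng.integers(0, 2**63, size=100, dtype=np.uint64)  # fake database codes
print(hamming_recall(db[0], db))                        # index 0 ranks first
```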
arXiv Detail & Related papers (2024-12-16T12:51:35Z)
- A Flexible Plug-and-Play Module for Generating Variable-Length [61.095479786194836]
Nested Hash Layer (NHL) is a plug-and-play module designed for existing deep supervised hashing models.
NHL simultaneously generates hash codes of varying lengths in a nested manner.
NHL achieves superior retrieval performance across various deep hashing models.
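One simple way to picture the nested property is prefix truncation, sketched below: each shorter code is a prefix of the longer one, so a single stored code serves every length. This is only an illustration of nesting, not NHL's actual training mechanism.

```python
import numpy as np

def nested_codes(full_code, lengths=(16, 32, 64)):
    """Prefix view of nested variable-length codes (illustration only)."""
    return {L: full_code[:L] for L in lengths}

code = np.random.default_rng(0).integers(0, 2, size=64, dtype=np.uint8)
views = nested_codes(code)
# A 16-bit code is literally the prefix of the 32- and 64-bit codes.
assert np.array_equal(views[16], views[32][:16])
```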
arXiv Detail & Related papers (2024-12-12T04:13:09Z)
- Differentially Private One Permutation Hashing and Bin-wise Consistent Weighted Sampling [37.6593006747285]
Minwise hashing (MinHash) is a standard algorithm widely used in the industry for large-scale search and learning applications.
One permutation hashing (OPH) is an efficient alternative to MinHash that splits the data vector into $K$ bins and generates hash values within each bin.
We propose the DP-OPH framework with three variants: DP-OPH-fix, DP-OPH-re, and DP-OPH-rand, depending on which densification strategy is adopted to deal with empty bins in OPH.
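The bin construction, and the empty-bin problem the three densification variants address, can be sketched directly: one global permutation replaces the $K$ independent permutations of classical MinHash, and each bin contributes one hash value. Sizes below are illustrative assumptions.

```python
import numpy as np

def one_permutation_hash(x, K=8, seed=0):
    """Minimal OPH sketch for a binary vector: permute once, split into
    K bins, and record the offset of the first nonzero in each bin.
    -1 marks an empty bin (what densification schemes must then fill)."""
    rng = np.random.default_rng(seed)
    bins = np.array_split(x[rng.permutation(len(x))], K)
    hashes = []
    for b in bins:
        nz = np.flatnonzero(b)
        hashes.append(nz[0] if nz.size else -1)   # -1 = empty bin
    return np.array(hashes)

x = (np.random.default_rng(1).random(64) < 0.2).astype(np.uint8)
print(one_permutation_hash(x))   # K hash values; -1 where a bin is empty
```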
arXiv Detail & Related papers (2023-06-13T10:38:12Z)
- A Lower Bound of Hash Codes' Performance [122.88252443695492]
In this paper, we prove that inter-class distinctiveness and intra-class compactness among hash codes determine the lower bound of hash codes' performance.
We then propose a surrogate model to fully exploit the above objective by estimating the posterior of hash codes and controlling it, which results in a low-bias optimization.
By testing on a series of hash models, we obtain performance improvements on all of them, with up to a $26.5\%$ increase in mean Average Precision and up to a $20.5\%$ increase in accuracy.
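The two quantities behind the bound can be made concrete: mean Hamming distance within a class (compactness, smaller is better) versus across classes (distinctiveness, larger is better). The sketch below computes both for labeled binary codes; it illustrates the quantities only, not the paper's surrogate model.

```python
import numpy as np

def intra_inter_hamming(codes, labels):
    """Mean Hamming distance within classes vs. across classes.
    codes : (n, b) binary array, labels : (n,) int array. Illustrative only.
    """
    # Pairwise Hamming distances via broadcasting (fine for small n).
    D = (codes[:, None, :] != codes[None, :, :]).sum(axis=2)
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(codes), dtype=bool)
    intra = D[same & off_diag].mean()     # compactness: want this small
    inter = D[~same].mean()               # distinctiveness: want this large
    return intra, inter

rng = np.random.default_rng(0)
codes = rng.integers(0, 2, size=(20, 32))
labels = rng.integers(0, 3, size=20)
print(intra_inter_hamming(codes, labels))
```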
arXiv Detail & Related papers (2022-10-12T03:30:56Z)
- Learning to Hash Naturally Sorts [84.90210592082829]
We introduce Naturally-Sorted Hashing (NSH) to train a deep hashing model with sorted results end-to-end.
NSH sorts the Hamming distances of samples' hash codes and accordingly gathers their latent representations for self-supervised training.
We describe a novel Sorted Noise-Contrastive Estimation (SortedNCE) loss that selectively picks positive and negative samples for contrastive learning.
arXiv Detail & Related papers (2022-01-31T16:19:02Z)
- Self-Distilled Hashing for Deep Image Retrieval [25.645550298697938]
In hash-based image retrieval systems, a transformed version of an input usually generates a hash code different from the original's.
We propose a novel self-distilled hashing scheme to minimize the discrepancy while exploiting the potential of augmented data.
We also introduce hash proxy-based similarity learning and a binary cross-entropy-based quantization loss to provide high-quality hash codes.
arXiv Detail & Related papers (2021-12-16T12:01:50Z)
- C-MinHash: Rigorously Reducing $K$ Permutations to Two [25.356048456005023]
Minwise hashing (MinHash) is an important and practical algorithm for generating random hashes to approximate the Jaccard (resemblance) similarity in massive binary (0/1) data.
We propose Circulant MinHash (C-MinHash) and provide the surprising theoretical result that only two independent random permutations are needed.
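For contrast, the classical estimator that C-MinHash compresses uses $K$ independent permutations, one hash per permutation; two minima collide with probability exactly equal to the Jaccard similarity. The sketch below is that classical baseline, not the circulant construction itself; the universe size and $K$ are illustrative.

```python
import numpy as np

def minhash_jaccard(a, b, K=128, seed=0):
    """Classical MinHash estimate of Jaccard similarity between two sets
    of integer ids, using K independent random permutations."""
    rng = np.random.default_rng(seed)
    universe = 1 << 16
    matches = 0
    for _ in range(K):
        perm = rng.permutation(universe)
        # Hash of a set = minimum permuted id; the two minima coincide
        # with probability equal to the Jaccard similarity.
        if perm[list(a)].min() == perm[list(b)].min():
            matches += 1
    return matches / K

a, b = {1, 2, 3, 4, 5}, {3, 4, 5, 6}
print(minhash_jaccard(a, b))   # true Jaccard = 3/6 = 0.5
```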
arXiv Detail & Related papers (2021-09-07T21:06:33Z)
- Unsupervised Multi-Index Semantic Hashing [23.169142004594434]
We propose an unsupervised hashing model that learns hash codes that are both effective and highly efficient by being optimized for multi-index hashing.
We experimentally compare MISH to state-of-the-art semantic hashing baselines in the task of document similarity search.
We find that even though multi-index hashing also improves the efficiency of the baselines compared to a linear scan, the baselines are still upwards of 33% slower than MISH.
arXiv Detail & Related papers (2021-03-26T13:33:48Z)
- Deep Reinforcement Learning with Label Embedding Reward for Supervised Image Hashing [85.84690941656528]
We introduce a novel decision-making approach for deep supervised hashing.
We learn a deep Q-network with a novel label embedding reward defined by Bose-Chaudhuri-Hocquenghem (BCH) codes.
Our approach outperforms state-of-the-art supervised hashing methods under various code lengths.
arXiv Detail & Related papers (2020-08-10T09:17:20Z)
- Procrustean Orthogonal Sparse Hashing [3.302605292858623]
We show that insect olfaction is structurally and functionally analogous to sparse hashing.
We present a novel method, Procrustean Orthogonal Sparse Hashing (POSH), that unifies these findings.
We propose two new methods, Binary OSL and SphericalHash, to address these deficiencies.
arXiv Detail & Related papers (2020-06-08T18:09:33Z)
- Reinforcing Short-Length Hashing [61.75883795807109]
Existing methods perform poorly when retrieving with extremely short hash codes.
In this study, we propose a novel reinforcing short-length hashing (RSLH) method.
In RSLH, mutual reconstruction between the hash representation and semantic labels is performed to preserve the semantic information.
Experiments on three large-scale image benchmarks demonstrate the superior performance of RSLH under various short-length hashing scenarios.
arXiv Detail & Related papers (2020-04-24T02:23:52Z)