Projected Hamming Dissimilarity for Bit-Level Importance Coding in
Collaborative Filtering
- URL: http://arxiv.org/abs/2103.14455v1
- Date: Fri, 26 Mar 2021 13:22:31 GMT
- Title: Projected Hamming Dissimilarity for Bit-Level Importance Coding in
Collaborative Filtering
- Authors: Christian Hansen, Casper Hansen, Jakob Grue Simonsen, Christina Lioma
- Abstract summary: We show a new way of measuring the dissimilarity between two objects in the Hamming space with binary weighting of each dimension.
We propose a variational hashing model for learning hash codes optimized for this projected Hamming dissimilarity.
The resultant hash codes lead to effectiveness gains of up to +7% in NDCG and +14% in MRR.
- Score: 21.563733343861713
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When reasoning about tasks that involve large amounts of data, a common
approach is to represent data items as objects in the Hamming space where
operations can be done efficiently and effectively. Object similarity can then
be computed by learning binary representations (hash codes) of the objects and
computing their Hamming distance. While this is highly efficient, each bit
dimension is equally weighted, which means that potentially discriminative
information of the data is lost. A more expressive alternative is to use
real-valued vector representations and compute their inner product; this allows
varying the weight of each dimension but is many orders of magnitude slower. To
fix this, we derive a new way of measuring the dissimilarity between two objects in
the Hamming space with binary weighting of each dimension (i.e., disabling
bits): we consider a field-agnostic dissimilarity that projects the vector of
one object onto the vector of the other. When working in the Hamming space,
this results in a novel projected Hamming dissimilarity, which by choice of
projection, effectively allows a binary importance weighting of the hash code
of one object through the hash code of the other. We propose a variational
hashing model for learning hash codes optimized for this projected Hamming
dissimilarity, and experimentally evaluate it in collaborative filtering
experiments. The resultant hash codes lead to effectiveness gains of up to +7%
in NDCG and +14% in MRR compared to state-of-the-art hashing-based
collaborative filtering baselines, while requiring no additional storage and no
computational overhead compared to using the Hamming distance.
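The abstract leaves the exact projection implicit, but the phrase "disabling bits" suggests the intuition: one object's hash code acts as a binary mask that selects which bit dimensions count toward the dissimilarity. Below is a minimal sketch under that reading; the function names, the 8-bit codes, and the masking formula are illustrative assumptions, not the paper's exact derivation (see the linked PDF for that).

```python
def hamming_distance(a: int, b: int) -> int:
    # Standard Hamming distance: every bit dimension weighted equally.
    return bin(a ^ b).count("1")

def projected_hamming_dissimilarity(mask_code: int, other_code: int) -> int:
    # Illustrative reading (an assumption, not the paper's exact formula):
    # bits that are 0 in mask_code are "disabled", so only the active bits
    # of the mask are compared against other_code. mask & (mask ^ other)
    # keeps exactly the active-but-mismatching bits, so the cost is the
    # same bitwise ops plus a popcount as the plain Hamming distance.
    return bin(mask_code & (mask_code ^ other_code)).count("1")

# Hypothetical 8-bit user/item hash codes, purely for illustration.
user_code = 0b1011_0010
item_code = 0b1001_0110
print(hamming_distance(user_code, item_code))                 # -> 2
print(projected_hamming_dissimilarity(user_code, item_code))  # -> 1
```

Note the asymmetry in this sketch: projecting the item code through the user code lets a user zero out bit dimensions irrelevant to their preferences, which is how a binary importance weighting can be applied with no extra storage and no computational overhead relative to the Hamming distance, consistent with the abstract's claim.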
Related papers
- Sparse-Inductive Generative Adversarial Hashing for Nearest Neighbor
Search [8.020530603813416]
We propose a novel unsupervised hashing method, termed Sparsity-Induced Generative Adversarial Hashing (SiGAH)
SiGAH encodes large-scale, high-dimensional features into binary codes via a generative adversarial training framework.
Experimental results on four benchmarks, i.e. Tiny100K, GIST1M, Deep1M, and MNIST, have shown that the proposed SiGAH has superior performance over state-of-the-art approaches.
arXiv Detail & Related papers (2023-06-12T08:07:23Z) - Asymmetric Scalable Cross-modal Hashing [51.309905690367835]
Cross-modal hashing is a successful approach to large-scale multimedia retrieval.
We propose a novel Asymmetric Scalable Cross-Modal Hashing (ASCMH) method.
Our ASCMH outperforms the state-of-the-art cross-modal hashing methods in terms of accuracy and efficiency.
arXiv Detail & Related papers (2022-07-26T04:38:47Z) - Dimensionality Reduction for Categorical Data [0.9560980936110233]
We present FSketch, a method for creating sketches of sparse categorical data, and an estimator of pairwise Hamming distances from those sketches.
FSketch is significantly faster, and the accuracy obtained using its sketches is among the best for the standard unsupervised tasks of RMSE, clustering, and similarity search.
arXiv Detail & Related papers (2021-12-01T09:20:28Z) - Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic
Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature spaces of each layer.
Using an attentive set encoder, we propose to meta-learn either diagonal or diagonal-plus-low-rank factors to efficiently construct task-specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z) - One Loss for All: Deep Hashing with a Single Cosine Similarity based
Learning Objective [86.48094395282546]
A deep hashing model typically has two main learning objectives: to make the learned binary hash codes discriminative and to minimize a quantization error.
We propose a novel deep hashing model with only a single learning objective.
Our model is highly effective, outperforming the state-of-the-art multi-loss hashing models on three large-scale instance retrieval benchmarks.
arXiv Detail & Related papers (2021-09-29T14:27:51Z) - Representation Learning for Efficient and Effective Similarity Search
and Recommendation [6.280255585012339]
This thesis makes contributions to representation learning that improve the effectiveness of hash codes through more expressive representations and a more effective similarity measure.
The contributions are empirically validated on several tasks related to similarity search and recommendation.
arXiv Detail & Related papers (2021-09-04T08:19:01Z) - Learning Optical Flow from a Few Matches [67.83633948984954]
We show that the dense correlation volume representation is redundant and that accurate flow estimation can be achieved with only a fraction of its elements.
Experiments show that our method can reduce computational cost and memory use significantly, while maintaining high accuracy.
arXiv Detail & Related papers (2021-04-05T21:44:00Z) - CIMON: Towards High-quality Hash Codes [63.37321228830102]
We propose a new method named Comprehensive sImilarity Mining and cOnsistency learNing (CIMON).
First, we use global refinement and similarity statistical distribution to obtain reliable and smooth guidance. Second, both semantic and contrastive consistency learning are introduced to derive both disturb-invariant and discriminative hash codes.
arXiv Detail & Related papers (2020-10-15T14:47:14Z) - Image Hashing by Minimizing Discrete Component-wise Wasserstein Distance [12.968141477410597]
Adversarial autoencoders are shown to be able to implicitly learn a robust, locality-preserving hash function that generates balanced and high-quality hash codes.
However, existing adversarial hashing methods are too inefficient to be employed for large-scale image retrieval applications.
We propose a new adversarial-autoencoder hashing approach that has a much lower sample requirement and computational cost.
arXiv Detail & Related papers (2020-02-29T00:22:53Z) - Auto-Encoding Twin-Bottleneck Hashing [141.5378966676885]
This paper proposes an efficient and adaptive code-driven graph, which is updated by decoding in the context of an auto-encoder.
Experiments on benchmarked datasets clearly show the superiority of our framework over the state-of-the-art hashing methods.
arXiv Detail & Related papers (2020-02-27T05:58:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.