Multivariate Representation Learning for Information Retrieval
- URL: http://arxiv.org/abs/2304.14522v1
- Date: Thu, 27 Apr 2023 20:30:46 GMT
- Title: Multivariate Representation Learning for Information Retrieval
- Authors: Hamed Zamani and Michael Bendersky
- Abstract summary: We propose a new representation learning framework for dense retrieval.
Instead of learning a vector for each query and document, our framework learns a multivariate distribution.
We show that it can be seamlessly integrated into the existing approximate nearest neighbor algorithms.
- Score: 31.31440742912932
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dense retrieval models use bi-encoder network architectures for learning
query and document representations. These representations are often in the form
of a vector representation and their similarities are often computed using the
dot product function. In this paper, we propose a new representation learning
framework for dense retrieval. Instead of learning a vector for each query and
document, our framework learns a multivariate distribution and uses negative
multivariate KL divergence to compute the similarity between distributions. For
simplicity and efficiency reasons, we assume that the distributions are
multivariate normals and then train large language models to produce mean and
variance vectors for these distributions. We provide a theoretical foundation
for the proposed framework and show that it can be seamlessly integrated into
the existing approximate nearest neighbor algorithms to perform retrieval
efficiently. We conduct an extensive suite of experiments on a wide range of
datasets, and demonstrate significant improvements compared to competitive
dense retrieval models.
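The abstract's claim that a negative KL divergence score can be served by existing approximate nearest neighbor indexes can be illustrated with a small sketch. It rests on assumptions not stated in the abstract: the covariances are diagonal (one variance per dimension), and the divergence is taken from the query distribution to the document distribution. The function names (`neg_kl_diag_gaussians`, `query_vector`, `doc_vector`, `query_offset`) are hypothetical and are not the authors' code. Under those assumptions, the score splits into a dot product between a query-side vector and a document-side vector plus a term that depends only on the query, so ranking documents by dot product gives the same order as ranking by negative KL divergence.

```python
import numpy as np


def neg_kl_diag_gaussians(mu_q, var_q, mu_d, var_d):
    """Negative KL(Q || D) for two normals with diagonal covariance."""
    kl = 0.5 * np.sum(
        np.log(var_d / var_q) + (var_q + (mu_q - mu_d) ** 2) / var_d - 1.0
    )
    return -kl


def query_vector(mu_q, var_q):
    """Query-side vector of size 2k + 1 (hypothetical layout)."""
    return np.concatenate(([1.0], var_q + mu_q ** 2, mu_q))


def doc_vector(mu_d, var_d):
    """Document-side vector of size 2k + 1; this is what an index would store."""
    doc_only = -0.5 * np.sum(np.log(var_d) + mu_d ** 2 / var_d)
    return np.concatenate(([doc_only], -0.5 / var_d, mu_d / var_d))


def query_offset(var_q):
    """Query-only term; constant across documents, so it never changes the ranking."""
    return 0.5 * np.sum(np.log(var_q) + 1.0)


# Toy check on random data: both scoring routes agree for every document.
rng = np.random.default_rng(0)
k, n_docs = 4, 6
mu_q, var_q = rng.normal(size=k), rng.uniform(0.5, 2.0, size=k)
doc_mus = rng.normal(size=(n_docs, k))
doc_vars = rng.uniform(0.5, 2.0, size=(n_docs, k))

direct = np.array(
    [neg_kl_diag_gaussians(mu_q, var_q, m, v) for m, v in zip(doc_mus, doc_vars)]
)
index = np.stack([doc_vector(m, v) for m, v in zip(doc_mus, doc_vars)])
dots = index @ query_vector(mu_q, var_q)

print(np.allclose(direct, dots + query_offset(var_q)))
```

Because the document-side vectors depend only on the documents, they can be computed once and stored in any maximum inner product index; only the query-side vector is built at search time.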
Related papers
- Binary Code Similarity Detection via Graph Contrastive Learning on Intermediate Representations [52.34030226129628]
Binary Code Similarity Detection (BCSD) plays a crucial role in numerous fields, including vulnerability detection, malware analysis, and code reuse identification.
In this paper, we propose IRBinDiff, which mitigates compilation differences by leveraging LLVM-IR with higher-level semantic abstraction.
Our extensive experiments, conducted under varied compilation settings, demonstrate that IRBinDiff outperforms other leading BCSD methods in both One-to-one comparison and One-to-many search scenarios.
arXiv Detail & Related papers (2024-10-24T09:09:20Z)
- Hierarchical Visual Categories Modeling: A Joint Representation Learning and Density Estimation Framework for Out-of-Distribution Detection [28.442470704073767]
This paper proposes a novel hierarchical visual category modeling scheme to separate out-of-distribution data from in-distribution data.
We conduct experiments on seven popular benchmarks, including CIFAR, iNaturalist, SUN, Places, Textures, ImageNet-O, and OpenImage-O.
Our visual representation achieves competitive performance compared with features learned by classical methods.
arXiv Detail & Related papers (2024-08-28T07:05:46Z)
- RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching).
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z)
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z)
- Convolutional autoencoder-based multimodal one-class classification [80.52334952912808]
One-class classification refers to approaches of learning using data from a single class only.
We propose a deep learning one-class classification method suitable for multimodal data.
arXiv Detail & Related papers (2023-09-25T12:31:18Z)
- An Upper Bound for the Distribution Overlap Index and Its Applications [18.481370450591317]
This paper proposes an easy-to-compute upper bound for the overlap index between two probability distributions.
The proposed bound shows its value in one-class classification and domain shift analysis.
Our work shows significant promise toward broadening the applications of overlap-based metrics.
arXiv Detail & Related papers (2022-12-16T20:02:03Z)
- Invariant Causal Mechanisms through Distribution Matching [86.07327840293894]
In this work we provide a causal perspective and a new algorithm for learning invariant representations.
Empirically we show that this algorithm works well on a diverse set of tasks and in particular we observe state-of-the-art performance on domain generalization.
arXiv Detail & Related papers (2022-06-23T12:06:54Z)
- Multimodal Adversarially Learned Inference with Factorized Discriminators [10.818838437018682]
We propose a novel approach to generative modeling of multimodal data based on generative adversarial networks.
To learn a coherent multimodal generative model, we show that it is necessary to align different encoder distributions with the joint decoder distribution simultaneously.
By taking advantage of contrastive learning through factorizing the discriminator, we train our model on unimodal data.
arXiv Detail & Related papers (2021-12-20T08:18:49Z)
- Multivariate Data Explanation by Jumping Emerging Patterns Visualization [78.6363825307044]
We present VAX (multiVariate dAta eXplanation), a new VA method to support the identification and visual interpretation of patterns in multivariate data sets.
Unlike the existing similar approaches, VAX uses the concept of Jumping Emerging Patterns to identify and aggregate several diversified patterns, producing explanations through logic combinations of data variables.
arXiv Detail & Related papers (2021-06-21T13:49:44Z)
- pRSL: Interpretable Multi-label Stacking by Learning Probabilistic Rules [0.0]
We present the probabilistic rule stacking (pRSL) which uses probabilistic propositional logic rules and belief propagation to combine the predictions of several underlying classifiers.
We derive algorithms for exact and approximate inference and learning, and show that pRSL reaches state-of-the-art performance on various benchmark datasets.
arXiv Detail & Related papers (2021-05-28T14:06:21Z)
- Orthogonal Multi-view Analysis by Successive Approximations via Eigenvectors [7.870955752916424]
The framework integrates the correlations within multiple views, supervised discriminant capacity, and distance preservation.
It not only includes several existing models as special cases, but also inspires novel models.
Experiments are conducted on various real-world datasets for multi-view discriminant analysis and multi-view multi-label classification.
arXiv Detail & Related papers (2020-10-04T17:16:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.