On the Replicability of Combining Word Embeddings and Retrieval Models
- URL: http://arxiv.org/abs/2001.04484v1
- Date: Mon, 13 Jan 2020 19:01:07 GMT
- Title: On the Replicability of Combining Word Embeddings and Retrieval Models
- Authors: Luca Papariello, Alexandros Bampoulidis, Mihai Lupu
- Abstract summary: We replicate recent experiments attempting to demonstrate an attractive hypothesis about the use of the Fisher kernel framework.
Specifically, the hypothesis was that the use of a mixture model of von Mises-Fisher (VMF) distributions would be beneficial because of the focus on cosine distances of both VMF and the vector space model.
- Score: 71.18271398274513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We replicate recent experiments attempting to demonstrate an attractive
hypothesis about the use of the Fisher kernel framework and mixture models for
aggregating word embeddings towards document representations and the use of
these representations in document classification, clustering, and retrieval.
Specifically, the hypothesis was that the use of a mixture model of von
Mises-Fisher (VMF) distributions instead of Gaussian distributions would be
beneficial because of the focus on cosine distances of both VMF and the vector
space model traditionally used in information retrieval. Previous experiments
had validated this hypothesis. Our replication was not able to validate it,
despite a large parameter scan space.
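The link the hypothesis draws between VMF distributions and cosine distance can be illustrated with a minimal sketch. It assumes NumPy, uses random vectors as hypothetical stand-ins for word embeddings, and omits the VMF normalizing constant; the function names are illustrative and are not taken from the paper. For unit-length vectors the VMF log-density is, up to an additive constant, the concentration `kappa` times the cosine similarity to the mean direction, so both quantities rank vectors identically.

```python
import numpy as np

def unit(v):
    """Scale vectors (along the last axis) to unit Euclidean length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def cosine_similarity(x, mu):
    """Cosine similarity between each row of x and the direction mu."""
    return unit(x) @ unit(mu)

def vmf_log_density_unnormalized(x, mu, kappa):
    """VMF log-density without its normalizing constant: kappa * <mu, x>."""
    return kappa * (unit(x) @ unit(mu))

rng = np.random.default_rng(0)
words = rng.normal(size=(5, 8))  # hypothetical stand-ins for word embeddings
mu = rng.normal(size=8)          # hypothetical mean direction of one mixture component
kappa = 10.0                     # assumed concentration parameter

scores = vmf_log_density_unnormalized(words, mu, kappa)
cosines = cosine_similarity(words, mu)

# Both quantities order the vectors in the same way.
same_ranking = np.array_equal(np.argsort(scores), np.argsort(cosines))
```

This shows only the single-component relationship; it does not reproduce the mixture model or the Fisher kernel aggregation evaluated in the paper.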
Related papers
- Structured Diffusion Models with Mixture of Gaussians as Prior Distribution [13.052085651071135]
We develop a simple-to-implement training procedure that smoothly accommodates the use of a mixture of Gaussians as the prior.
Our method is shown to be robust to mis-specifications and in particular suits situations where training resources are limited or faster training in real time is desired.
arXiv Detail & Related papers (2024-10-24T20:34:06Z)
- Quasi-Bayes meets Vines [2.3124143670964448]
We propose a different way to extend Quasi-Bayesian prediction to high dimensions through the use of Sklar's theorem.
We show that our proposed Quasi-Bayesian Vine (QB-Vine) is a fully non-parametric density estimator with an analytical form.
arXiv Detail & Related papers (2024-06-18T16:31:02Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
- Projection Regret: Reducing Background Bias for Novelty Detection via Diffusion Models [72.07462371883501]
We propose Projection Regret (PR), an efficient novelty detection method that mitigates the bias of non-semantic information.
PR computes the perceptual distance between the test image and its diffusion-based projection to detect abnormality.
Extensive experiments demonstrate that PR outperforms the prior art of generative-model-based novelty detection methods by a significant margin.
arXiv Detail & Related papers (2023-12-05T09:44:47Z)
- Conformal inference for regression on Riemannian Manifolds [49.7719149179179]
We investigate prediction sets for regression scenarios when the response variable, denoted by $Y$, resides in a manifold, and the covariable, denoted by $X$, lies in Euclidean space.
We prove the almost sure convergence of the empirical version of these regions on the manifold to their population counterparts.
arXiv Detail & Related papers (2023-10-12T10:56:25Z)
- Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval [139.21955930418815]
Cross-modal Retrieval methods build similarity relations between vision and language modalities by jointly learning a common representation space.
However, the predictions are often unreliable due to aleatoric uncertainty, which is induced by low-quality data, e.g., corrupt images, fast-paced videos, and non-detailed texts.
We propose a novel Prototype-based Aleatoric Uncertainty Quantification (PAU) framework to provide trustworthy predictions by quantifying the uncertainty arising from the inherent data ambiguity.
arXiv Detail & Related papers (2023-09-29T09:41:19Z)
- Mixture of von Mises-Fisher distribution with sparse prototypes [0.0]
Mixtures of von Mises-Fisher distributions can be used to cluster data on the unit hypersphere.
We propose in this article to estimate a von Mises mixture using an $l_1$ penalized likelihood.
arXiv Detail & Related papers (2022-12-30T08:00:38Z)
- Geometric Priors for Scientific Generative Models in Inertial Confinement Fusion [32.1427322437781]
We develop a Wasserstein autoencoder (WAE) with a hyperspherical prior for multimodal data.
We exploit a known relationship between the modalities in the dataset as a scientific constraint, and study different properties of the proposed model.
arXiv Detail & Related papers (2021-11-24T21:06:36Z)
- Variational Capsule Encoder [6.244396213953519]
We propose a novel capsule network based variational encoder architecture, called Bayesian capsules (B-Caps).
We hypothesized that this approach can learn a better representation of features in the latent space than traditional approaches.
Our results indicate the strength of capsule networks in representation learning which has never been examined under the VAE settings.
arXiv Detail & Related papers (2020-10-18T20:52:16Z)