Related papers: MA-DPR: Manifold-aware Distance Metrics for Dense Passage Retrieval

URL: http://arxiv.org/abs/2509.13562v1
Date: Tue, 16 Sep 2025 22:02:56 GMT
Title: MA-DPR: Manifold-aware Distance Metrics for Dense Passage Retrieval
Authors: Yifan Liu, Qianfeng Wen, Mark Zhao, Jiazhou Liang, Scott Sanner
Abstract summary: We propose a manifold-aware distance metric for DPR (MA-DPR). We show that MA-DPR outperforms Euclidean and cosine distances by up to 26% on OOD passage retrieval.
Score: 21.576774075150123
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Dense Passage Retrieval (DPR) typically relies on Euclidean or cosine distance to measure query-passage relevance in embedding space, which is effective when embeddings lie on a linear manifold. However, our experiments across DPR benchmarks suggest that embeddings often lie on lower-dimensional, non-linear manifolds, especially in out-of-distribution (OOD) settings, where cosine and Euclidean distance fail to capture semantic similarity. To address this limitation, we propose a manifold-aware distance metric for DPR (MA-DPR) that models the intrinsic manifold structure of passages using a nearest neighbor graph and measures query-passage distance based on their shortest path in this graph. We show that MA-DPR outperforms Euclidean and cosine distances by up to 26% on OOD passage retrieval with comparable in-distribution performance across various embedding models while incurring a minimal increase in query inference time. Empirical evidence suggests that manifold-aware distance allows DPR to leverage context from related neighboring passages, making it effective even in the absence of direct semantic overlap. MA-DPR can be applied to a wide range of dense embedding and retrieval tasks, offering potential benefits across a wide spectrum of domains.
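
The idea in the abstract (a nearest neighbor graph over passages, with query-passage distance taken as a shortest path in that graph) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `knn_graph` and `graph_distances`, the use of Euclidean edge weights, the choice k=2, the way the query is attached to its k nearest passages, and the toy 2-D "horseshoe" embeddings are all assumptions made for illustration.

```python
import heapq
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph; edge weight = Euclidean distance.

    Hypothetical helper: the paper may weight or prune edges differently.
    """
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )[:k]
        for d, j in nearest:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def graph_distances(points, graph, query, k):
    """Dijkstra from the query, attached to its k nearest passages (assumption)."""
    dist = {}
    heap = sorted((math.dist(query, p), i) for i, p in enumerate(points))[:k]
    heapq.heapify(heap)
    while heap:
        d, i = heapq.heappop(heap)
        if i in dist:
            continue  # already settled with a shorter path
        dist[i] = d
        for j, w in graph[i].items():
            if j not in dist:
                heapq.heappush(heap, (d + w, j))
    return dist


# Toy "horseshoe" of passage embeddings: 16 points on the unit circle,
# at 0, 20, ..., 300 degrees, leaving a gap between the two ends.
passages = [
    (math.cos(math.radians(a)), math.sin(math.radians(a)))
    for a in range(0, 301, 20)
]
# The query sits just outside one end of the horseshoe, inside the gap.
query = (math.cos(math.radians(-5)), math.sin(math.radians(-5)))

graph = knn_graph(passages, k=2)
dist = graph_distances(passages, graph, query, k=2)

graph_rank = sorted(dist, key=dist.get)
euclid_rank = sorted(
    range(len(passages)), key=lambda i: math.dist(query, passages[i])
)

print("Euclidean top-4:", euclid_rank[:4])
print("Graph top-4:    ", graph_rank[:4])
```

In this toy setup the passage at the far end of the horseshoe (index 15) is close to the query in a straight line, so it appears in the Euclidean top-4, while the shortest-path ranking places it last because every path to it must follow the curve. Passages unreachable from the query are simply absent from `dist`; a real system would need a policy for them.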

Related papers

Efficient Thought Space Exploration through Strategic Intervention [54.35208611253168]
We propose a novel Hint-Practice Reasoning (HPR) framework that operationalizes this insight through two synergistic components. The framework's core innovation lies in Distributional Inconsistency Reduction (DIR), which dynamically identifies intervention points. Experiments across arithmetic and commonsense reasoning benchmarks demonstrate HPR's state-of-the-art efficiency-accuracy tradeoffs.
arXiv Detail & Related papers (2025-11-13T07:26:01Z)
Radial Neighborhood Smoothing Recommender System [0.0]
Radial Neighborhood Estimator (RNE) is proposed to construct neighborhoods based on overlapped and partially overlapped user-item pairs. RNE achieves superior performance compared to existing collaborative filtering and matrix factorization methods.
arXiv Detail & Related papers (2025-07-14T06:01:58Z)
GeoMM: On Geodesic Perspective for Multi-modal Learning [55.41612200877861]
This paper introduces geodesic distance as a novel distance metric in multi-modal learning for the first time. Our approach incorporates a comprehensive series of strategies to adapt geodesic distance to current multimodal learning.
arXiv Detail & Related papers (2025-05-16T13:12:41Z)
MPAD: A New Dimension-Reduction Method for Preserving Nearest Neighbors in High-Dimensional Vector Search [1.1701842638497677]
Dimensionality reduction (DR) is seldom applied due to its tendency to distort nearest-neighbor structure critical for search. We present MPAD: Maximum Pairwise Absolute Difference, an unsupervised DR method that explicitly preserves approximate NN relations. Experiments across multiple domains show that MPAD consistently outperforms standard DR methods in preserving neighborhood structure.
arXiv Detail & Related papers (2025-04-23T00:59:00Z)
Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method enables us to scale to high-dimensional tasks on nontrivial manifolds. We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high-dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z)
KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation [72.71398034617607]
KERPLE is a framework that generalizes relative position embedding for extrapolation by kernelizing positional differences. The diversity of CPD kernels allows us to derive various RPEs that enable length extrapolation in a principled way.
arXiv Detail & Related papers (2022-05-20T01:25:57Z)
Out-of-distribution Detection with Deep Nearest Neighbors [33.71627349163909]
Out-of-distribution (OOD) detection is a critical task for deploying machine learning models in the open world. In this paper, we explore the efficacy of non-parametric nearest-neighbor distance for OOD detection. We demonstrate the effectiveness of nearest-neighbor-based OOD detection on several benchmarks and establish superior performance.
arXiv Detail & Related papers (2022-04-13T16:45:21Z)
Cycle Consistent Probability Divergences Across Different Spaces [38.43511529063335]
Discrepancy measures between probability distributions are at the core of statistical inference and machine learning. This work proposes a novel unbalanced Monge optimal transport formulation for matching, up to isometries, distributions on different spaces.
arXiv Detail & Related papers (2021-11-22T16:35:58Z)
Kernel distance measures for time series, random fields and other structured data [71.61147615789537]
kdiff is a novel kernel-based measure for estimating distances between instances of structured data. It accounts for both self and cross similarities across the instances and is defined using a lower quantile of the distance distribution. Some theoretical results are provided for separability conditions using kdiff as a distance measure for clustering and classification problems.
arXiv Detail & Related papers (2021-09-29T22:54:17Z)
Diffusion Earth Mover's Distance and Distribution Embeddings [61.49248071384122]
Diffusion EMD can be computed in $\tilde{O}(n)$ time and is more accurate than similarly fast algorithms such as tree-based EMDs. We show Diffusion EMD is fully differentiable, making it amenable to future uses in gradient-descent frameworks such as deep neural networks.
arXiv Detail & Related papers (2021-02-25T13:18:32Z)
On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected. Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances. Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
arXiv Detail & Related papers (2020-06-22T14:35:33Z)
Theoretical Guarantees for Bridging Metric Measure Embedding and Optimal Transport [18.61019008000831]
We consider a method that embeds metric measure spaces in a common Euclidean space and computes an optimal transport (OT) on the embedded distributions. This leads to what we call a sub-embedding robust Wasserstein (SERW) distance.
arXiv Detail & Related papers (2020-02-19T17:52:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.