Learning Distances from Data with Normalizing Flows and Score Matching
- URL: http://arxiv.org/abs/2407.09297v1
- Date: Fri, 12 Jul 2024 14:30:41 GMT
- Title: Learning Distances from Data with Normalizing Flows and Score Matching
- Authors: Peter Sorrenson, Daniel Behrend-Uriarte, Christoph Schnörr, Ullrich Köthe
- Abstract summary: Density-based distances offer an elegant solution to the problem of metric learning.
We show that existing methods to estimate Fermat distances suffer from poor convergence in both low and high dimensions.
Our work paves the way for practical use of density-based distances, especially in high-dimensional spaces.
- Score: 9.605001452209867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Density-based distances (DBDs) offer an elegant solution to the problem of metric learning. By defining a Riemannian metric which increases with decreasing probability density, shortest paths naturally follow the data manifold and points are clustered according to the modes of the data. We show that existing methods to estimate Fermat distances, a particular choice of DBD, suffer from poor convergence in both low and high dimensions due to i) inaccurate density estimates and ii) reliance on graph-based paths which are increasingly rough in high dimensions. To address these issues, we propose learning the densities using a normalizing flow, a generative model with tractable density estimation, and employing a smooth relaxation method using a score model initialized from a graph-based proposal. Additionally, we introduce a dimension-adapted Fermat distance that exhibits more intuitive behavior when scaled to high dimensions and offers better numerical properties. Our work paves the way for practical use of density-based distances, especially in high-dimensional spaces.
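As an illustration of the graph-based baseline the abstract refers to, the following is a minimal sketch of a power-weighted shortest-path estimator on a k-nearest-neighbor graph. It is not the authors' implementation: the function name `fermat_graph_distance`, the parameters `alpha` and `k`, and the choice of raising each Euclidean edge length to the power `alpha` are assumptions made for this example. It does not include the normalizing flow, the score model, or the smooth relaxation step proposed in the paper.

```python
import heapq
import math


def fermat_graph_distance(points, source, target, alpha=2.0, k=3):
    """Power-weighted shortest-path distance on a symmetric kNN graph.

    Hypothetical helper for illustration: each edge costs
    (Euclidean length) ** alpha, so for alpha > 1 many short hops
    through densely sampled regions are cheaper than one long jump.
    """
    n = len(points)

    # Build a symmetric kNN graph with power-weighted edges.
    neighbors = [dict() for _ in range(n)]
    for i in range(n):
        by_distance = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in by_distance[:k]:
            weight = d ** alpha
            neighbors[i][j] = weight
            neighbors[j][i] = weight

    # Dijkstra's algorithm from source to target.
    best = [math.inf] * n
    best[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == target:
            return cost
        if cost > best[u]:
            continue  # stale heap entry
        for v, weight in neighbors[u].items():
            new_cost = cost + weight
            if new_cost < best[v]:
                best[v] = new_cost
                heapq.heappush(heap, (new_cost, v))
    return math.inf  # target not reachable in the kNN graph


# Five evenly spaced samples on a line.
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
print(fermat_graph_distance(line, 0, 4, alpha=2.0, k=4))  # 4.0
```

With `alpha=2.0` and a complete graph (`k=4`), the four unit hops cost 1 each for a total of 4.0, whereas the direct jump of length 4 would cost 16. This shows how the power weighting pushes paths through sampled points. Because such paths are piecewise linear between samples, they become rough in high dimensions, which is the limitation the abstract attributes to graph-based methods.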
Related papers
- Follow the Energy, Find the Path: Riemannian Metrics from Energy-Based Models [35.5088111343308]
We propose a method for deriving Riemannian metrics directly from pretrained Energy-Based Models.
These metrics define spatially varying distances, enabling the computation of geodesics.
We show that EBM-derived metrics consistently outperform established baselines.
arXiv Detail & Related papers (2025-05-23T12:18:08Z) - Probability Density Geodesics in Image Diffusion Latent Space [57.99700072218375]
We show that geodesic diffusions can be computed in latent space.
We analyze how closely video clips approximate geodesics in a pre-trained image diffusion space.
arXiv Detail & Related papers (2025-04-09T08:28:53Z) - Dimension reduction and the gradient flow of relative entropy [0.0]
Dimension reduction, widely used in science, maps high-dimensional data into low-dimensional space.
We investigate a basic mathematical model underlying the techniques of neighborhood embedding (SNE) and its popular variant t-SNE.
The aim is to map these points to low dimensions in an optimal way so that similar points are closer together.
arXiv Detail & Related papers (2024-09-25T14:23:04Z) - Graph Laplacian-based Bayesian Multi-fidelity Modeling [1.383698759122035]
A graph Laplacian constructed from the low-fidelity data is used to define a multivariate Gaussian prior density.
Few high-fidelity data points are used to construct a conjugate likelihood term.
The results demonstrate that by utilizing a small fraction of high-fidelity data, the multi-fidelity approach can significantly improve the accuracy of a large collection of low-fidelity data points.
arXiv Detail & Related papers (2024-09-12T16:51:55Z) - A Bayesian Approach Toward Robust Multidimensional Ellipsoid-Specific Fitting [0.0]
This work presents a novel and effective method for fitting multidimensional ellipsoids to scattered data contaminated by noise and outliers.
We incorporate a uniform prior distribution to constrain the search for primitive parameters within an ellipsoidal domain.
We apply it to a wide range of practical applications such as microscopy cell counting, 3D reconstruction, geometric shape approximation, and magnetometer calibration tasks.
arXiv Detail & Related papers (2024-07-27T14:31:51Z) - Density Estimation via Binless Multidimensional Integration [45.21975243399607]
We introduce the Binless Multidimensional Thermodynamic Integration (BMTI) method for nonparametric, robust, and data-efficient density estimation.
BMTI estimates the logarithm of the density by initially computing log-density differences between neighbouring data points.
The method is tested on a variety of complex synthetic high-dimensional datasets, and is benchmarked on realistic datasets from the chemical physics literature.
arXiv Detail & Related papers (2024-07-10T23:45:20Z) - SPARE: Symmetrized Point-to-Plane Distance for Robust Non-Rigid Registration [76.40993825836222]
We propose SPARE, a novel formulation that utilizes a symmetrized point-to-plane distance for robust non-rigid registration.
The proposed method greatly improves the accuracy of non-rigid registration problems and maintains relatively high solution efficiency.
arXiv Detail & Related papers (2024-05-30T15:55:04Z) - Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method enables us to scale to high-dimensional tasks on nontrivial manifolds.
We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z) - LIDL: Local Intrinsic Dimension Estimation Using Approximate Likelihood [10.35315334180936]
We propose a novel approach to the problem: Local Intrinsic Dimension estimation using approximate Likelihood (LIDL).
Our method relies on an arbitrary density estimation method as its subroutine and hence tries to sidestep the dimensionality challenge.
We show that LIDL yields competitive results on the standard benchmarks for this problem and that it scales to thousands of dimensions.
arXiv Detail & Related papers (2022-06-29T19:47:46Z) - Density Ratio Estimation via Infinitesimal Classification [85.08255198145304]
We propose DRE-$\infty$, a divide-and-conquer approach to reduce density ratio estimation (DRE) to a series of easier subproblems.
Inspired by Monte Carlo methods, we smoothly interpolate between the two distributions via an infinite continuum of intermediate bridge distributions.
We show that our approach performs well on downstream tasks such as mutual information estimation and energy-based modeling on complex, high-dimensional datasets.
arXiv Detail & Related papers (2021-11-22T06:26:29Z) - Featurized Density Ratio Estimation [82.40706152910292]
In our work, we propose to leverage an invertible generative model to map the two distributions into a common feature space prior to estimation.
This featurization brings the densities closer together in latent space, sidestepping pathological scenarios where the learned density ratios in input space can be arbitrarily inaccurate.
At the same time, the invertibility of our feature map guarantees that the ratios computed in feature space are equivalent to those in input space.
arXiv Detail & Related papers (2021-07-05T18:30:26Z) - Meta-Learning for Relative Density-Ratio Estimation [59.75321498170363]
Existing methods for (relative) density-ratio estimation (DRE) require many instances from both densities.
We propose a meta-learning method for relative DRE, which estimates the relative density-ratio from a few instances by using knowledge in related datasets.
We empirically demonstrate the effectiveness of the proposed method by using three problems: relative DRE, dataset comparison, and outlier detection.
arXiv Detail & Related papers (2021-07-02T02:13:45Z) - A Graph-based approach to derive the geodesic distance on Statistical manifolds: Application to Multimedia Information Retrieval [5.1388648724853825]
We leverage the properties of non-Euclidean Geometry to define the Geodesic distance.
We propose an approximation of the Geodesic distance through a graph-based method.
Our main aim is to compare the graph-based approximation to the state of the art approximations.
arXiv Detail & Related papers (2021-06-26T16:39:54Z) - BikNN: Anomaly Estimation in Bilateral Domains with k-Nearest Neighbors [1.2183405753834562]
A novel framework for anomaly estimation is proposed in this paper.
We attempt to estimate the degree of anomaly in both spatial and density domains.
Our method takes into account both the spatial domain and the density domain and can be adapted to different datasets by adjusting a few parameters manually.
arXiv Detail & Related papers (2021-05-11T13:45:29Z) - Learning Optical Flow from a Few Matches [67.83633948984954]
We show that the dense correlation volume representation is redundant and accurate flow estimation can be achieved with only a fraction of elements in it.
Experiments show that our method can reduce computational cost and memory use significantly, while maintaining high accuracy.
arXiv Detail & Related papers (2021-04-05T21:44:00Z) - Nonparametric Density Estimation from Markov Chains [68.8204255655161]
We introduce a new nonparametric density estimator inspired by Markov chains, generalizing the well-known Kernel Density Estimator.
Our estimator presents several benefits with respect to the usual ones and can be used straightforwardly as a foundation in all density-based algorithms.
arXiv Detail & Related papers (2020-09-08T18:33:42Z) - Variable Skipping for Autoregressive Range Density Estimation [84.60428050170687]
We show a technique, variable skipping, for accelerating range density estimation over deep autoregressive models.
We show that variable skipping provides 10-100$\times$ efficiency improvements when targeting challenging high-quantile error metrics.
arXiv Detail & Related papers (2020-07-10T19:01:40Z) - Manifold Learning via Manifold Deflation [105.7418091051558]
Dimensionality reduction methods provide a valuable means to visualize and interpret high-dimensional data.
Many popular methods can fail dramatically, even on simple two-dimensional manifolds.
This paper presents an embedding method for a novel, incremental tangent space estimator that incorporates global structure as coordinates.
Empirically, we show our algorithm recovers novel and interesting embeddings on real-world and synthetic datasets.
arXiv Detail & Related papers (2020-07-07T10:04:28Z)