Density Estimation via Binless Multidimensional Integration
- URL: http://arxiv.org/abs/2407.08094v2
- Date: Sun, 14 Jul 2024 14:38:16 GMT
- Title: Density Estimation via Binless Multidimensional Integration
- Authors: Matteo Carli, Alex Rodriguez, Alessandro Laio, Aldo Glielmo
- Abstract summary: We introduce the Binless Multidimensional Thermodynamic Integration (BMTI) method for nonparametric, robust, and data-efficient density estimation.
BMTI estimates the logarithm of the density by initially computing log-density differences between neighbouring data points.
The method is tested on a variety of complex synthetic high-dimensional datasets, and is benchmarked on realistic datasets from the chemical physics literature.
- Score: 45.21975243399607
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce the Binless Multidimensional Thermodynamic Integration (BMTI) method for nonparametric, robust, and data-efficient density estimation. BMTI estimates the logarithm of the density by initially computing log-density differences between neighbouring data points. Subsequently, such differences are integrated, weighted by their associated uncertainties, using a maximum-likelihood formulation. This procedure can be seen as an extension to a multidimensional setting of the thermodynamic integration, a technique developed in statistical physics. The method leverages the manifold hypothesis, estimating quantities within the intrinsic data manifold without defining an explicit coordinate map. It does not rely on any binning or space partitioning, but rather on the construction of a neighbourhood graph based on an adaptive bandwidth selection procedure. BMTI mitigates the limitations commonly associated with traditional nonparametric density estimators, effectively reconstructing smooth profiles even in high-dimensional embedding spaces. The method is tested on a variety of complex synthetic high-dimensional datasets, where it is shown to outperform traditional estimators, and is benchmarked on realistic datasets from the chemical physics literature.
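The integration step described in the abstract (combining noisy log-density differences between neighbours, weighted by their uncertainties, under a maximum-likelihood criterion) can be illustrated with a small sketch. This is not the authors' implementation. It assumes the differences and their standard deviations are already given, and that the errors are independent and Gaussian, so that maximum likelihood reduces to weighted least squares on the neighbourhood graph. The abstract does not specify the likelihood, and the function name and arguments are hypothetical.

```python
import numpy as np


def integrate_log_density_differences(n_points, edges, deltas, sigmas):
    """Recover log-densities, up to an additive constant, from pairwise differences.

    edges[k] = (i, j) means deltas[k] estimates log_rho[j] - log_rho[i]
    with standard deviation sigmas[k] (assumed independent Gaussian errors).
    """
    edges = np.asarray(edges)
    deltas = np.asarray(deltas, dtype=float)
    weights = 1.0 / np.asarray(sigmas, dtype=float)

    # Weighted incidence matrix: row k encodes (F[j] - F[i]) / sigma_k.
    A = np.zeros((len(edges), n_points))
    rows = np.arange(len(edges))
    A[rows, edges[:, 0]] = -weights
    A[rows, edges[:, 1]] = weights
    b = deltas * weights

    # Minimising ||A F - b||^2 is the Gaussian maximum-likelihood solution.
    F, *_ = np.linalg.lstsq(A, b, rcond=None)

    # Differences alone cannot fix the additive constant; use zero mean as the gauge.
    return F - F.mean()


# Usage: five points on a line with a Gaussian-shaped log-density.
x = np.linspace(-2.0, 2.0, 5)
true_log_rho = -0.5 * x**2
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2)]
deltas = [true_log_rho[j] - true_log_rho[i] for i, j in edges]
estimate = integrate_log_density_differences(5, edges, deltas, [0.1] * len(edges))
```

Because only differences enter the fit, the result is defined up to an additive constant. The sketch fixes it by centring the estimate at zero mean, whereas a normalised density would require a separate normalisation step.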
Related papers
- Learning Distances from Data with Normalizing Flows and Score Matching [9.605001452209867]
Density-based distances offer an elegant solution to the problem of metric learning.
We show that existing methods to estimate Fermat distances suffer from poor convergence in both low and high dimensions.
Our work paves the way for practical use of density-based distances, especially in high-dimensional spaces.
arXiv Detail & Related papers (2024-07-12T14:30:41Z) - Minimizing robust density power-based divergences for general parametric density models [3.0277213703725767]
We introduce an approach to minimize Density power divergence (DPD) for general parametric densities.
The proposed approach can also be employed to minimize other density power-based $\gamma$-divergences.
arXiv Detail & Related papers (2023-07-11T13:33:47Z) - Solving High-Dimensional PDEs with Latent Spectral Models [74.1011309005488]
We present Latent Spectral Models (LSM) toward an efficient and precise solver for high-dimensional PDEs.
Inspired by classical spectral methods in numerical analysis, we design a neural spectral block to solve PDEs in the latent space.
LSM achieves consistent state-of-the-art and yields a relative gain of 11.5% averaged on seven benchmarks.
arXiv Detail & Related papers (2023-01-30T04:58:40Z) - AD-DMKDE: Anomaly Detection through Density Matrices and Fourier Features [0.0]
The method can be seen as an efficient approximation of Kernel Density Estimation (KDE).
A systematic comparison of the proposed method with eleven state-of-the-art anomaly detection methods on various data sets is presented.
arXiv Detail & Related papers (2022-10-26T15:43:16Z) - Quantum Adaptive Fourier Features for Neural Density Estimation [0.0]
This paper presents a method for neural density estimation that can be seen as a type of kernel density estimation.
The method is based on density matrices, a formalism used in quantum mechanics, and adaptive Fourier features.
The method was evaluated in different synthetic and real datasets, and its performance compared against state-of-the-art neural density estimation methods.
arXiv Detail & Related papers (2022-08-01T01:39:11Z) - Nonlinear Isometric Manifold Learning for Injective Normalizing Flows [58.720142291102135]
We use isometries to separate manifold learning and density estimation.
We also employ autoencoders to design embeddings with explicit inverses that do not distort the probability distribution.
arXiv Detail & Related papers (2022-03-08T08:57:43Z) - Density-Based Clustering with Kernel Diffusion [59.4179549482505]
A naive density corresponding to the indicator function of a unit $d$-dimensional Euclidean ball is commonly used in density-based clustering algorithms.
We propose a new kernel diffusion density function, which is adaptive to data of varying local distributional characteristics and smoothness.
arXiv Detail & Related papers (2021-10-11T09:00:33Z) - Featurized Density Ratio Estimation [82.40706152910292]
In our work, we propose to leverage an invertible generative model to map the two distributions into a common feature space prior to estimation.
This featurization brings the densities closer together in latent space, sidestepping pathological scenarios where the learned density ratios in input space can be arbitrarily inaccurate.
At the same time, the invertibility of our feature map guarantees that the ratios computed in feature space are equivalent to those in input space.
arXiv Detail & Related papers (2021-07-05T18:30:26Z) - Improving Metric Dimensionality Reduction with Distributed Topology [68.8204255655161]
DIPOLE is a dimensionality-reduction post-processing step that corrects an initial embedding by minimizing a loss functional with both a local, metric term and a global, topological term.
We observe that DIPOLE outperforms popular methods like UMAP, t-SNE, and Isomap on a number of popular datasets.
arXiv Detail & Related papers (2021-06-14T17:19:44Z) - High-Dimensional Non-Parametric Density Estimation in Mixed Smooth Sobolev Spaces [31.663702435594825]
Density estimation plays a key role in many tasks in machine learning, statistical inference, and visualization.
The main bottlenecks in high-dimensional density estimation are the prohibitive computational cost and the slow convergence rate.
We propose novel estimators for high-dimensional non-parametric density estimation called the adaptive hyperbolic cross density estimators.
arXiv Detail & Related papers (2020-06-05T21:27:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.