LIDL: Local Intrinsic Dimension Estimation Using Approximate Likelihood
- URL: http://arxiv.org/abs/2206.14882v1
- Date: Wed, 29 Jun 2022 19:47:46 GMT
- Title: LIDL: Local Intrinsic Dimension Estimation Using Approximate Likelihood
- Authors: Piotr Tempczyk, Rafał Michaluk, Łukasz Garncarek, Przemysław Spurek, Jacek Tabor, Adam Goliński
- Abstract summary: We propose a novel approach to the problem: Local Intrinsic Dimension estimation using approximate Likelihood (LIDL).
Our method relies on an arbitrary density estimation method as its subroutine and hence tries to sidestep the dimensionality challenge.
We show that LIDL yields competitive results on the standard benchmarks for this problem and that it scales to thousands of dimensions.
- Score: 10.35315334180936
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most of the existing methods for estimating the local intrinsic dimension of
a data distribution do not scale well to high-dimensional data. Many of them
rely on a non-parametric nearest neighbors approach which suffers from the
curse of dimensionality. We attempt to address that challenge by proposing a
novel approach to the problem: Local Intrinsic Dimension estimation using
approximate Likelihood (LIDL). Our method relies on an arbitrary density
estimation method as its subroutine and hence tries to sidestep the
dimensionality challenge by making use of the recent progress in parametric
neural methods for likelihood estimation. We carefully investigate the
empirical properties of the proposed method, compare them with our theoretical
predictions, and show that LIDL yields competitive results on the standard
benchmarks for this problem and that it scales to thousands of dimensions. What
is more, we anticipate this approach to improve further with the continuing
advances in the density estimation literature.
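The core computation behind LIDL is a regression: perturb the data with Gaussian noise at several scales $\delta$, estimate the log-likelihood of a query point at each scale, and regress those log-likelihoods against $\log \delta$; the slope estimates $d - D$, so the LID estimate is $D$ plus the slope. Below is a minimal sketch of that regression in which exact Gaussian log-densities stand in for the normalizing-flow estimates the paper would train; the toy data and the noise grid are illustrative choices, not the paper's setup.

    import numpy as np

    # Toy setup: a d-dimensional Gaussian embedded in R^D, so the true LID is d.
    # We evaluate densities at the origin, a point on the manifold.
    D, d = 100, 7
    deltas = np.geomspace(0.01, 0.1, 8)   # noise scales (illustrative grid)

    # Exact log-density of the delta-perturbed distribution at the origin.
    # In LIDL this number would come from a density model (e.g. a normalizing
    # flow) trained on data perturbed with N(0, delta^2 I) noise.
    def log_rho(delta):
        var_on = 1.0 + delta ** 2    # variance along the d manifold directions
        var_off = delta ** 2         # variance along the D - d normal directions
        return -0.5 * (d * np.log(2 * np.pi * var_on)
                       + (D - d) * np.log(2 * np.pi * var_off))

    # log rho_delta(x) ~ (d - D) * log(delta) + const for small delta, so the
    # regression slope estimates d - D and the LID estimate is D + slope.
    slope = np.polyfit(np.log(deltas), [log_rho(s) for s in deltas], 1)[0]
    print(f"estimated LID: {D + slope:.2f} (true: {d})")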
Related papers
- Learning Distances from Data with Normalizing Flows and Score Matching [9.605001452209867]
Density-based distances offer an elegant solution to the problem of metric learning.
We show that existing methods to estimate Fermat distances suffer from poor convergence in both low and high dimensions.
Our work paves the way for practical use of density-based distances, especially in high-dimensional spaces.
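As background, a standard sample-based estimator of Fermat distances computes shortest paths over the point cloud with Euclidean edge lengths raised to a power $p > 1$, which makes routes through dense regions comparatively cheap. A minimal sketch follows; the toy data, the choice of $p$, and the fully connected graph are illustrative, and this is not the cited paper's flow- and score-based estimator:

    import numpy as np
    from scipy.sparse.csgraph import shortest_path

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))                      # toy point cloud
    p = 3.0                                            # power > 1 favors dense regions
    pairwise = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    fermat = shortest_path(pairwise ** p, method="D")  # all-pairs Dijkstra
    print(fermat[0, 1])                                # estimated Fermat distance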
arXiv Detail & Related papers (2024-07-12T14:30:41Z)
- Efficient Nearest Neighbor based Uncertainty Estimation for Natural Language Processing Tasks [26.336947440529713]
$k$-Nearest Neighbor Uncertainty Estimation ($k$NN-UE) is an uncertainty estimation method that uses the distances to an example's nearest neighbors and the label-existence ratio among those neighbors.
Our experiments show that our proposed method outperforms the baselines or recent density-based methods in confidence calibration, selective prediction, and out-of-distribution detection.
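A loose reading of that recipe as code; every scaling choice below (the exponential distance term, the product form, the temperature) is an assumption made for illustration, not the paper's exact estimator:

    import numpy as np

    # Discount a model's base confidence using (i) the mean distance to the k
    # nearest neighbors and (ii) the fraction of those neighbors whose label
    # matches the prediction.
    def knn_adjusted_confidence(base_conf, neighbor_dists, neighbor_labels,
                                predicted_label, temperature=1.0):
        label_ratio = np.mean(neighbor_labels == predicted_label)
        distance_term = np.exp(-np.mean(neighbor_dists) / temperature)
        return base_conf * distance_term * label_ratio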
arXiv Detail & Related papers (2024-07-02T10:33:31Z)
- A Wiener process perspective on local intrinsic dimension estimation methods [1.6988007266875604]
Local intrinsic dimension (LID) estimation methods have received a lot of attention in recent years thanks to the progress in deep neural networks and generative modeling.
In this paper, we investigate the recent state-of-the-art parametric LID estimation methods from the perspective of the Wiener process.
arXiv Detail & Related papers (2024-06-24T20:27:13Z)
- A Finite-Horizon Approach to Active Level Set Estimation [0.7366405857677227]
We consider the problem of active learning in the context of spatial sampling for level set estimation (LSE).
We present a finite-horizon search procedure to perform LSE in one dimension while optimally balancing both the final estimation error and the distance traveled for a fixed number of samples.
We show that the resulting optimization problem can be solved in closed form and that the resulting policy generalizes existing approaches to this problem.
arXiv Detail & Related papers (2023-10-18T14:11:41Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) arises in applications where multiple estimates of the same unknown are averaged for improved performance.
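A loose sketch of how a bias constraint can enter training as a penalty term; the penalty weight and the batch-level bias proxy are assumptions of this sketch, not the paper's exact formulation:

    import numpy as np

    # MSE plus a penalty on the squared empirical bias. In practice the bias
    # term would be computed over estimates that share the same underlying
    # parameter value; a single batch stands in for that here.
    def bias_constrained_loss(estimates, targets, lam=1.0):
        mse = np.mean((estimates - targets) ** 2)
        bias = np.mean(estimates - targets)
        return mse + lam * bias ** 2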
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic Approach to Manifold Dimension Estimation [92.81218653234669]
We present a new approach to manifold hypothesis checking and underlying manifold dimension estimation.
Our geometrical method adapts the well-known box-counting algorithm for Minkowski dimension calculation to sparse data.
Experiments on real datasets show that the suggested approach, combining the two methods, is powerful and effective.
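For reference, the vanilla box-counting procedure being adapted counts occupied grid cells $N(\varepsilon)$ at several scales and reads the dimension off the slope of $\log N(\varepsilon)$ versus $\log(1/\varepsilon)$; a minimal sketch (the cited paper's sparse-data modification is not captured here):

    import numpy as np

    def box_counting_dimension(X, epsilons):
        counts = []
        for eps in epsilons:
            cells = np.floor(X / eps)                  # grid cell index per point
            counts.append(len(np.unique(cells, axis=0)))
        # Slope of log N(eps) vs log(1/eps) estimates the Minkowski dimension.
        return np.polyfit(np.log(1.0 / np.asarray(epsilons)), np.log(counts), 1)[0]

    # A planar (2D) cloud embedded in 3D should give a dimension close to 2.
    X = np.random.default_rng(0).uniform(size=(5000, 3)) * [1.0, 1.0, 0.0]
    print(box_counting_dimension(X, epsilons=[0.2, 0.1, 0.05, 0.025]))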
arXiv Detail & Related papers (2021-07-08T15:35:54Z)
- Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization [94.18714844247766]
Wasserstein barycenters provide a geometric notion of the weighted average of probability measures based on optimal transport.
We present a scalable algorithm to compute Wasserstein-2 barycenters given sample access to the input measures.
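Concretely, the object being estimated is the standard Wasserstein-2 barycenter (definition stated here for reference):

    $\bar{\mu} = \arg\min_{\mu} \sum_{i=1}^{n} w_i \, W_2^2(\mu, \mu_i)$

where $\mu_1, \dots, \mu_n$ are the input measures and $w_i \ge 0$ their weights; per the title, the algorithm avoids the minimax formulations used by earlier continuous approaches.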
arXiv Detail & Related papers (2021-02-02T21:01:13Z)
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
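For reference, CNML predicts with the standard normalized-maximum-likelihood form (notation assumed here for illustration):

    $p_{\mathrm{CNML}}(y \mid x) = \dfrac{p(y \mid x, \hat{\theta}(\mathcal{D} \cup \{(x, y)\}))}{\sum_{y'} p(y' \mid x, \hat{\theta}(\mathcal{D} \cup \{(x, y')\}))}$

where $\hat{\theta}(\cdot)$ denotes maximum-likelihood estimation on the training set augmented with the candidate pair; the amortization in ACNML sidesteps the per-query optimization that makes exact CNML expensive.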
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
- Estimating Barycenters of Measures in High Dimensions [30.563217903502807]
We propose a scalable and general algorithm for estimating barycenters of measures in high dimensions.
We prove local convergence under mild assumptions on the discrepancy, showing that the approach is well-posed.
Our approach is the first to be used to estimate barycenters in thousands of dimensions.
arXiv Detail & Related papers (2020-07-14T15:24:41Z)
- Variable Skipping for Autoregressive Range Density Estimation [84.60428050170687]
We present variable skipping, a technique for accelerating range density estimation over deep autoregressive models.
We show that variable skipping provides 10-100$\times$ efficiency improvements when targeting challenging high-quantile error metrics.
arXiv Detail & Related papers (2020-07-10T19:01:40Z)
- $\gamma$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator [95.71091446753414]
We propose to use a nearest-neighbor-based $\gamma$-divergence estimator as a data discrepancy measure.
Our method achieves significantly higher robustness than existing discrepancy measures.
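As background, the $\gamma$-divergence between densities $g$ and $f$ is commonly written in the log form of Fujisawa and Eguchi (stated here for reference; the paper's notation may differ):

    $D_\gamma(g, f) = \frac{1}{\gamma(1+\gamma)} \log \int g^{1+\gamma}\,dx - \frac{1}{\gamma} \log \int g f^{\gamma}\,dx + \frac{1}{1+\gamma} \log \int f^{1+\gamma}\,dx$

which vanishes when $g = f$; the $\gamma$-power weighting is what makes the divergence robust to outliers.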
arXiv Detail & Related papers (2020-06-13T06:09:27Z)