Dimensionality Reduction for General KDE Mode Finding
- URL: http://arxiv.org/abs/2305.18755v3
- Date: Thu, 1 Jun 2023 22:51:38 GMT
- Title: Dimensionality Reduction for General KDE Mode Finding
- Authors: Xinyu Luo, Christopher Musco, Cas Widdershoven
- Abstract summary: Finding the mode of a high dimensional probability distribution $D$ is a fundamental problem in statistics and data analysis.
We show that there is no time algorithm for finding the mode of a kernel density estimate, unless $mathitP = mathitNP$.
- Score: 12.779486428760373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Finding the mode of a high dimensional probability distribution $D$ is a
fundamental algorithmic problem in statistics and data analysis. There has been
particular interest in efficient methods for solving the problem when $D$ is
represented as a mixture model or kernel density estimate, although few
algorithmic results with worst-case approximation and runtime guarantees are
known. In this work, we significantly generalize a result of (LeeLiMusco:2021)
on mode approximation for Gaussian mixture models. We develop randomized
dimensionality reduction methods for mixtures involving a broader class of
kernels, including the popular logistic, sigmoid, and generalized Gaussian
kernels. As in Lee et al.'s work, our dimensionality reduction results yield
quasi-polynomial algorithms for mode finding with multiplicative accuracy
$(1-\epsilon)$ for any $\epsilon > 0$. Moreover, when combined with gradient
descent, they yield efficient practical heuristics for the problem. In addition
to our positive results, we prove a hardness result for box kernels, showing
that there is no polynomial time algorithm for finding the mode of a kernel
density estimate, unless $\mathit{P} = \mathit{NP}$. Obtaining similar hardness
results for kernels used in practice (like Gaussian or logistic kernels) is an
interesting future direction.
Related papers
- Learning general Gaussian mixtures with efficient score matching [16.06356123715737]
We study the problem of learning mixtures of $k$ Gaussians in $d$ dimensions.
We make no separation assumptions on the underlying mixture components.
We give an algorithm that draws $dmathrmpoly(k/varepsilon)$ samples from the target mixture, runs in sample-polynomial time, and constructs a sampler.
arXiv Detail & Related papers (2024-04-29T17:30:36Z) - Learning Mixtures of Gaussians Using Diffusion Models [9.118706387430883]
We give a new algorithm for learning mixtures of $k$ Gaussians to TV error.
Our approach is analytic and relies on the framework of diffusion models.
arXiv Detail & Related papers (2024-04-29T17:00:20Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $Omega(dkstar/2)$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $mathbbRd$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/epsilon)O(k)$, whereepsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood
Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
arXiv Detail & Related papers (2023-06-05T21:08:34Z) - Approximating a RUM from Distributions on k-Slates [88.32814292632675]
We find a generalization-time algorithm that finds the RUM that best approximates the given distribution on average.
Our theoretical result can also be made practical: we obtain a that is effective and scales to real-world datasets.
arXiv Detail & Related papers (2023-05-22T17:43:34Z) - Kernel Packet: An Exact and Scalable Algorithm for Gaussian Process
Regression with Mat\'ern Correlations [23.560067934682294]
We develop an exact and scalable algorithm for one-dimensional Gaussian process regression with Mat'ern correlations.
The proposed algorithm is significantly superior to the existing alternatives in both the computational time and predictive accuracy.
arXiv Detail & Related papers (2022-03-07T03:30:35Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence
of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymotic analysis of sampling error in 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - Projected Stochastic Gradient Langevin Algorithms for Constrained
Sampling and Non-Convex Learning [0.0]
Langevin algorithms are methods with additive noise.
Langevin algorithms have been used for decades in chain Carlo (Milon)
For learning, we show that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that it is that
arXiv Detail & Related papers (2020-12-22T16:19:20Z) - Fourier Sparse Leverage Scores and Approximate Kernel Learning [29.104055676527913]
We prove new explicit upper bounds on the leverage scores of Fourier functions under both the Gaussian and Laplace measures.
Our bounds are motivated by two important applications in machine learning.
arXiv Detail & Related papers (2020-06-12T17:25:39Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.