Beyond Smoothness: Incorporating Low-Rank Analysis into Nonparametric
Density Estimation
- URL: http://arxiv.org/abs/2204.00930v1
- Date: Sat, 2 Apr 2022 19:45:07 GMT
- Authors: Robert A. Vandermeulen and Antoine Ledent
- Abstract summary: We introduce a new nonparametric latent variable model based on the Tucker decomposition.
A rudimentary implementation of our estimators experimentally demonstrates a considerable performance improvement over the standard histogram estimator.
- Score: 20.38883021295225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The construction and theoretical analysis of the most popular universally
consistent nonparametric density estimators hinge on one functional property:
smoothness. In this paper we investigate the theoretical implications of
incorporating a multi-view latent variable model, a type of low-rank model,
into nonparametric density estimation. To do this we perform extensive analysis
on histogram-style estimators that integrate a multi-view model. Our analysis
culminates in showing that there exists a universally consistent
histogram-style estimator that converges to any multi-view model with a finite
number of Lipschitz continuous components at a rate of
$\widetilde{O}(1/\sqrt[3]{n})$ in $L^1$ error. In contrast, the standard
histogram estimator can converge at a rate slower than $1/\sqrt[d]{n}$ on the
same class of densities. We also introduce a new nonparametric latent variable
model based on the Tucker decomposition. A rudimentary implementation of our
estimators experimentally demonstrates a considerable performance improvement
over the standard histogram estimator. We also provide a thorough analysis of
the sample complexity of our Tucker decomposition-based model and a variety of
other results. Thus, our paper provides solid theoretical foundations for
extending low-rank techniques to the nonparametric setting.
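To make the baseline in the abstract concrete, the following is a minimal sketch of a standard histogram density estimator fitted to samples from a small multi-view model. It is not the authors' low-rank estimator. The function names (`sample_multiview`, `fit_histogram`), the two triangular component densities, and the choice of 4 bins per dimension are all assumptions made for illustration only.

```python
import random
from collections import Counter


def sample_multiview(n, d, weights, rng):
    """Draw n points in [0, 1]^d from a two-component multi-view model.

    A latent component is chosen first; given the component, the d
    coordinates are drawn independently from a Lipschitz density on [0, 1].
    (Hypothetical components, chosen only for illustration.)
    """
    points = []
    for _ in range(n):
        component = rng.choices([0, 1], weights=weights)[0]
        if component == 0:
            # Density 2x on [0, 1], sampled by inverting the CDF x^2.
            point = tuple(rng.random() ** 0.5 for _ in range(d))
        else:
            # Density 2(1 - x) on [0, 1], the mirror image of the above.
            point = tuple(1.0 - rng.random() ** 0.5 for _ in range(d))
        points.append(point)
    return points


def fit_histogram(points, bins):
    """Standard histogram estimator on [0, 1]^d with `bins` bins per axis."""
    n = len(points)
    d = len(points[0])

    def cell_of(x):
        # Clamp so that a coordinate equal to 1.0 falls in the last bin.
        return tuple(min(int(c * bins), bins - 1) for c in x)

    counts = Counter(cell_of(p) for p in points)
    cell_volume = (1.0 / bins) ** d

    def density(x):
        # Fraction of samples in the cell, divided by the cell volume.
        return counts[cell_of(x)] / (n * cell_volume)

    return density, counts


rng = random.Random(0)
points = sample_multiview(n=2000, d=3, weights=[0.5, 0.5], rng=rng)
density, counts = fit_histogram(points, bins=4)

# The estimate integrates to one: each cell contributes count / n.
total_mass = sum(counts.values()) / len(points)
print(f"occupied cells: {len(counts)} of {4 ** 3}, total mass: {total_mass}")
```

The number of cells grows as `bins ** d`, so the standard histogram must populate exponentially many cells as the dimension increases. A low-rank structure such as the multi-view model would only need the per-coordinate component densities and the mixing weights.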
Related papers
- Advancing Wasserstein Convergence Analysis of Score-Based Models: Insights from Discretization and Second-Order Acceleration [5.548787731232499]
We focus on the Wasserstein convergence analysis of score-based diffusion models.
We compare various discretization schemes, including Euler discretization, exponential midpoint and randomization methods.
We propose an accelerated sampler based on the local linearization method.
arXiv Detail & Related papers (2025-02-07T11:37:51Z) - A Unified Analysis for Finite Weight Averaging [50.75116992029417]
Averaging the iterates of Stochastic Gradient Descent (SGD) has achieved empirical success in training deep learning models, through methods such as Stochastic Weight Averaging (SWA), Exponential Moving Average (EMA), and LAtest Weight Averaging (LAWA).
In this paper, we generalize LAWA as Finite Weight Averaging (FWA) and explain their advantages compared to SGD from the perspective of optimization and generalization.
arXiv Detail & Related papers (2024-11-20T10:08:22Z) - Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate.
We stress that our estimators do not involve nonparametric function estimators and in particular do not rely on sample-size dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z) - Mean-Square Analysis of Discretized Itô Diffusions for Heavy-tailed
Sampling [17.415391025051434]
We analyze the complexity of sampling from a class of heavy-tailed distributions by discretizing a natural class of Itô diffusions associated with weighted Poincaré inequalities.
Based on a mean-square analysis, we establish the iteration complexity for obtaining a sample whose distribution is $\epsilon$-close to the target distribution in the Wasserstein-2 metric.
arXiv Detail & Related papers (2023-03-01T15:16:03Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal [70.15267479220691]
We consider and analyze the sample complexity of model-free reinforcement learning with a generative model.
Our analysis shows that it is nearly minimax-optimal for finding an $\varepsilon$-optimal policy when $\varepsilon$ is sufficiently small.
arXiv Detail & Related papers (2022-05-27T19:39:24Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - The Heavy-Tail Phenomenon in SGD [7.366405857677226]
We show that, depending on the structure of the Hessian of the loss at the minimum, the SGD iterates will converge to a heavy-tailed stationary distribution.
We translate our results into insights about the behavior of SGD in deep learning.
arXiv Detail & Related papers (2020-06-08T16:43:56Z) - Nonparametric Score Estimators [49.42469547970041]
Estimating the score from a set of samples generated by an unknown distribution is a fundamental task in inference and learning of probabilistic models.
We provide a unifying view of these estimators under the framework of regularized nonparametric regression.
We propose score estimators based on iterative regularization that enjoy computational benefits from curl-free kernels and fast convergence.
arXiv Detail & Related papers (2020-05-20T15:01:03Z) - A Precise High-Dimensional Asymptotic Theory for Boosting and
Minimum-$\ell_1$-Norm Interpolated Classifiers [3.167685495996986]
This paper establishes a precise high-dimensional theory for boosting on separable data.
Under a class of statistical models, we provide an exact analysis of the generalization error of boosting.
We also explicitly pin down the relation between the boosting test error and the optimal Bayes error.
arXiv Detail & Related papers (2020-02-05T00:24:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.