Nonparametric mixture MLEs under Gaussian-smoothed optimal transport
distance
- URL: http://arxiv.org/abs/2112.02421v1
- Date: Sat, 4 Dec 2021 20:05:58 GMT
- Title: Nonparametric mixture MLEs under Gaussian-smoothed optimal transport
distance
- Authors: Fang Han, Zhen Miao, and Yandi Shen
- Abstract summary: We adopt the GOT framework, rather than its unsmoothed counterpart, to approximate the true data-generating distribution.
A key step in our analysis is the establishment of a new Jackson-type approximation bound for Gaussian-convoluted Lipschitz functions.
This insight bridges existing techniques of analyzing the nonparametric MLEs and the new GOT framework.
- Score: 0.39373541926236766
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Gaussian-smoothed optimal transport (GOT) framework, pioneered in
Goldfeld et al. (2020) and followed up by a series of subsequent papers, has
quickly caught attention among researchers in statistics, machine learning,
information theory, and related fields. One key observation made therein is
that, by adapting to the GOT framework instead of its unsmoothed counterpart,
the curse of dimensionality for using the empirical measure to approximate the
true data generating distribution can be lifted. The current paper shows that a
related observation applies to the estimation of nonparametric mixing
distributions in discrete exponential family models, where under the GOT cost
the estimation accuracy of the nonparametric MLE can be accelerated to a
polynomial rate. This is in sharp contrast to the classical sub-polynomial
rates based on unsmoothed metrics, which cannot be improved from an
information-theoretical perspective. A key step in our analysis is the
establishment of a new Jackson-type approximation bound of Gaussian-convoluted
Lipschitz functions. This insight bridges existing techniques of analyzing the
nonparametric MLEs and the new GOT framework.
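For concreteness (the definition is not spelled out above), the GOT distance of Goldfeld et al. (2020) at smoothing level $\sigma > 0$ is $W_1^{(\sigma)}(P, Q) = W_1(P * \mathcal{N}_\sigma, Q * \mathcal{N}_\sigma)$, where $\mathcal{N}_\sigma = \mathcal{N}(0, \sigma^2 I_d)$ and $*$ denotes convolution. By Kantorovich-Rubinstein duality, $W_1^{(\sigma)}(P, Q) = \sup_{\mathrm{Lip}(f) \le 1} \int (f * \varphi_\sigma)\, d(P - Q)$, with $\varphi_\sigma$ the $\mathcal{N}(0, \sigma^2 I_d)$ density. This is why the analysis reduces to approximating Gaussian-convoluted Lipschitz functions $f * \varphi_\sigma$, the objects controlled by the Jackson-type bound mentioned above.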
Related papers
- Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate.
We stress that our estimators do not involve nonparametric function estimators and, in particular, do not rely on sample-size-dependent smoothing parameters.
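As a point of reference (not the paper's $\sqrt{n}$-consistent modification), here is a minimal sketch of the classical one-nearest-neighbor matching estimator of an average treatment effect that such modifications start from; the simulated data and all settings are illustrative only.

    # Plain 1-nearest-neighbor matching estimator of an average treatment
    # effect (ATE). Illustrative baseline only, not the bias-corrected,
    # root-n-consistent estimator proposed in the paper above.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    rng = np.random.default_rng(0)
    n, d = 2000, 3
    X = rng.normal(size=(n, d))                       # covariates
    T = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))   # treatment depends on X
    Y = X.sum(axis=1) + 2.0 * T + rng.normal(size=n)  # outcome; true ATE = 2

    y_t, y_c = Y[T == 1], Y[T == 0]
    nn_c = NearestNeighbors(n_neighbors=1).fit(X[T == 0])
    nn_t = NearestNeighbors(n_neighbors=1).fit(X[T == 1])
    match_c = nn_c.kneighbors(X[T == 1], return_distance=False).ravel()
    match_t = nn_t.kneighbors(X[T == 0], return_distance=False).ravel()

    # Impute each unit's missing counterfactual from its nearest neighbor
    # in the opposite treatment arm, then average the contrasts.
    effects = np.concatenate([y_t - y_c[match_c], y_t[match_t] - y_c])
    print(f"matching ATE estimate: {effects.mean():.3f} (simulated truth: 2.0)")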
arXiv Detail & Related papers (2024-07-11T13:28:34Z) - von Mises Quasi-Processes for Bayesian Circular Regression [57.88921637944379]
We explore a family of expressive and interpretable distributions over circle-valued random functions.
The resulting probability model has connections with continuous spin models in statistical physics.
For posterior inference, we introduce a new Stratonovich-like augmentation that lends itself to fast Markov Chain Monte Carlo sampling.
arXiv Detail & Related papers (2024-06-19T01:57:21Z) - Semi-parametric Expert Bayesian Network Learning with Gaussian Processes
and Horseshoe Priors [26.530289799110562]
This paper proposes a model for learning semi-parametric relationships in an Expert Bayesian Network (SEBN).
We use Gaussian Processes and a Horseshoe prior to introduce minimal nonlinear components.
In real-world datasets with unknown ground truth, we generate diverse graphs to accommodate user input, addressing identifiability issues and enhancing interpretability.
arXiv Detail & Related papers (2024-01-29T18:57:45Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood
Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
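A rough illustration of the generic ingredient named above: Monte Carlo probe vectors combined with an iterative linear solver so that no covariance matrix is ever inverted explicitly. The toy model and all numbers are assumptions for illustration, not the probabilistic-unrolling algorithm itself.

    # Estimate the trace term tr(Sigma^{-1} dSigma) that appears in the
    # gradient of a Gaussian log-likelihood, without forming Sigma^{-1}:
    # Hutchinson probe vectors (Monte Carlo) + conjugate gradient (solver).
    import numpy as np
    from scipy.sparse.linalg import cg

    rng = np.random.default_rng(0)
    n = 500
    A = rng.normal(size=(n, n)) / np.sqrt(n)
    Sigma = A @ A.T + np.eye(n)    # SPD covariance of a latent Gaussian model
    dSigma = np.eye(n)             # d Sigma / d (noise-variance parameter)

    def inv_times(v):
        """Approximate Sigma^{-1} v with conjugate gradient (matrix-free)."""
        x, _ = cg(Sigma, v)
        return x

    # tr(Sigma^{-1} dSigma) = E_z[ z^T Sigma^{-1} dSigma z ] for Rademacher z
    probes = rng.choice([-1.0, 1.0], size=(20, n))
    estimate = np.mean([z @ inv_times(dSigma @ z) for z in probes])
    exact = np.trace(np.linalg.solve(Sigma, dSigma))
    print(f"Monte Carlo + CG estimate: {estimate:.2f}   exact: {exact:.2f}")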
arXiv Detail & Related papers (2023-06-05T21:08:34Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z) - Information Theoretic Structured Generative Modeling [13.117829542251188]
A novel generative model framework called the structured generative model (SGM) is proposed that makes straightforward optimization possible.
The implementation employs a single neural network driven by an orthonormal input to a single white noise source adapted to learn an infinite Gaussian mixture model.
Preliminary results show that SGM significantly improves on MINE estimation in terms of data efficiency and variance, on conventional and variational Gaussian mixture models, and on training adversarial networks.
arXiv Detail & Related papers (2021-10-12T07:44:18Z) - Machine Learning and Variational Algorithms for Lattice Field Theory [1.198562319289569]
In lattice quantum field theory studies, parameters defining the lattice theory must be tuned toward criticality to access continuum physics.
We introduce an approach to "deform" Monte Carlo estimators based on contour deformations applied to the domain of the path integral.
We demonstrate that flow-based MCMC can mitigate critical slowing down and observifolds can exponentially reduce variance in proof-of-principle applications.
arXiv Detail & Related papers (2021-06-03T16:37:05Z) - A Riemannian Newton Trust-Region Method for Fitting Gaussian Mixture
Models [0.0]
We introduce a formula for the Riemannian Hessian for Gaussian Mixture Models.
On top of this, we propose a new Newton Trust-Region method which outperforms current approaches both in terms of runtime and number of iterations.
arXiv Detail & Related papers (2021-04-30T12:48:32Z) - Reducing the Variance of Variational Estimates of Mutual Information by
Limiting the Critic's Hypothesis Space to RKHS [0.0]
Mutual information (MI) is an information-theoretic measure of dependency between two random variables.
Recent methods parameterize the probability distributions or the critic as a neural network to approximate unknown density ratios.
We argue that the high variance characteristic is due to the uncontrolled complexity of the critic's hypothesis space.
arXiv Detail & Related papers (2020-11-17T14:32:48Z)
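To make the last entry above concrete, here is a toy sketch of a variational (Donsker-Varadhan) mutual-information lower bound whose critic is a kernel expansion, i.e. a function in an RKHS, instead of a neural network; the kernel, penalty, and optimization settings are illustrative choices, not the paper's construction.

    # Donsker-Varadhan MI lower bound with an RKHS (Gaussian-kernel) critic.
    import numpy as np

    rng = np.random.default_rng(0)
    n, rho = 512, 0.8
    x = rng.normal(size=n)
    y = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=n)  # correlated pair
    true_mi = -0.5 * np.log(1 - rho**2)

    joint = np.stack([x, y], axis=1)                  # samples from p(x, y)
    prod = np.stack([x, rng.permutation(y)], axis=1)  # samples from p(x)p(y)

    def gram(a, b, bw=0.5):
        """Gaussian kernel matrix between row-wise 2-D points."""
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * bw**2))

    K_jj, K_pj = gram(joint, joint), gram(prod, joint)
    alpha = np.zeros(n)            # critic f(.) = K(., joint samples) @ alpha
    lam, lr = 1e-3, 0.05

    for _ in range(300):
        f_prod = K_pj @ alpha
        w = np.exp(f_prod - f_prod.max())
        w /= w.sum()               # softmax weights for the log-mean-exp term
        # ascend  mean(f_joint) - log mean exp(f_prod) - lam * ||f||_RKHS^2
        grad = K_jj.mean(axis=0) - K_pj.T @ w - 2 * lam * (K_jj @ alpha)
        alpha += lr * grad

    f_joint, f_prod = K_jj @ alpha, K_pj @ alpha
    m = f_prod.max()
    dv_bound = f_joint.mean() - (np.log(np.mean(np.exp(f_prod - m))) + m)
    print(f"DV bound with RKHS critic: {dv_bound:.3f}   true MI: {true_mi:.3f}")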
This list is automatically generated from the titles and abstracts of the papers on this site.