Sample Complexity of Probability Divergences under Group Symmetry
- URL: http://arxiv.org/abs/2302.01915v2
- Date: Mon, 22 May 2023 22:29:07 GMT
- Title: Sample Complexity of Probability Divergences under Group Symmetry
- Authors: Ziyu Chen, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu
- Abstract summary: In the cases of the Wasserstein-1 metric and the Lipschitz-regularized $\alpha$-divergences, the reduction in sample complexity is proportional to an ambient-dimension-dependent power of the group size.
For the maximum mean discrepancy (MMD), the improvement in sample complexity is more nuanced, as it depends not only on the group size but also on the choice of kernel.
- Score: 9.478466072397708
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We rigorously quantify the improvement in the sample complexity of
variational divergence estimations for group-invariant distributions. In the
cases of the Wasserstein-1 metric and the Lipschitz-regularized
$\alpha$-divergences, the reduction in sample complexity is proportional to an
ambient-dimension-dependent power of the group size. For the maximum mean
discrepancy (MMD), the improvement in sample complexity is more nuanced, as it
depends not only on the group size but also on the choice of kernel. Numerical
simulations verify our theories.
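As a concrete illustration of the symmetrization at play (a minimal numerical sketch under assumed choices, not the authors' code), the following compares a plain empirical MMD estimate with one computed from group-averaged empirical measures, for a target invariant under the sign-flip group $\{\pm 1\}^d$; the kernel, bandwidth, and group are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def mmd2(x, y, bw=1.0):
    """Biased estimate of squared MMD with a Gaussian kernel."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * bw ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def orbit_augment(x):
    """Group-average the empirical measure over the sign-flip group
    {+1, -1}^d by appending every group image of every sample."""
    d = x.shape[1]
    signs = np.array(np.meshgrid(*([[1, -1]] * d))).T.reshape(-1, d)
    return (x[:, None, :] * signs[None, :, :]).reshape(-1, d)

# Two independent samples from the same sign-flip-invariant target
# (a standard Gaussian), so the true MMD is zero.
n, d = 50, 2
x = rng.standard_normal((n, d))
y = rng.standard_normal((n, d))

print("plain MMD^2      :", mmd2(x, y))
print("symmetrized MMD^2:", mmd2(orbit_augment(x), orbit_augment(y)))
```

On average the symmetrized estimate fluctuates less around zero; that variance reduction is the sample-complexity gain the paper quantifies.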
Related papers
- Covariance estimation using Markov chain Monte Carlo [2.209921757303168]
We show that when $\pi$ satisfies a Poincaré inequality and the chain possesses a spectral gap, we can achieve similar sample complexity using MCMC.
We provide guarantees regarding isotropic rounding procedures for sampling uniformly on convex bodies.
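For intuition only (a toy sketch, not the paper's estimator or assumptions): an unadjusted Langevin chain targeting a Gaussian $\pi$, whose long-run sample covariance approaches the true covariance; the step size, horizon, and burn-in below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Unadjusted Langevin chain targeting pi = N(0, Sigma); such a pi satisfies
# a Poincare inequality and the discretized chain mixes geometrically.
Sigma = np.array([[2.0, 0.8], [0.8, 1.0]])
P = np.linalg.inv(Sigma)              # grad log pi(x) = -P @ x
h, T = 0.05, 20000

x = np.zeros(2)
samples = np.empty((T, 2))
for t in range(T):
    x = x - h * P @ x + np.sqrt(2 * h) * rng.standard_normal(2)
    samples[t] = x

burn = T // 5
print(np.cov(samples[burn:].T))       # close to Sigma, up to O(h) bias
```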
arXiv Detail & Related papers (2024-10-22T16:27:29Z)
- Sample Complexity Bounds for Estimating Probability Divergences under Invariances [31.946304450935628]
Group-invariant probability distributions appear in many data-generative models in machine learning.
In this work, we study how the inherent invariances, with respect to any smooth action of a Lie group on a manifold, improve sample complexity.
Results are completely new for groups of positive dimension and extend recent bounds for finite group actions.
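Heuristically (an illustrative reading of finite-group bounds of this type, not a quotation of the paper's theorems), symmetrizing the empirical measure over a finite group $G$ acts like multiplying the sample size by $|G|$ in the classical Wasserstein-1 rate:

$$
\mathbb{E}\, W_1(\mu, \hat\mu_n) \;\lesssim\; n^{-1/d},
\qquad
\mathbb{E}\, W_1\big(\mu, \hat\mu_n^G\big) \;\lesssim\; (|G|\, n)^{-1/d},
$$

where $\hat\mu_n^G$ is the $G$-symmetrized empirical measure and $d$ the ambient dimension; for groups of positive dimension the gain can appear in the exponent itself.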
arXiv Detail & Related papers (2023-11-06T04:45:21Z)
- The Exact Sample Complexity Gain from Invariances for Kernel Regression [37.74032673086741]
In practice, encoding invariances into models improves sample complexity.
We provide minimax optimal rates for kernel ridge regression on compact manifolds.
Our results hold for any smooth compact Lie group action, even groups of positive dimension.
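A minimal sketch of the standard construction behind such results (an assumed instance, not the paper's experiments): kernel ridge regression with a group-averaged kernel, here over the cyclic rotation group $C_4$ with an arbitrary bandwidth and regularization:

```python
import numpy as np

rng = np.random.default_rng(2)

def gauss_k(a, b, bw=0.5):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

def invariant_k(a, b, group):
    """Group-averaged kernel k_G(x, y) = (1/|G|) sum_g k(g.x, y);
    its RKHS contains only G-invariant functions."""
    return np.mean([gauss_k(a @ g.T, b) for g in group], axis=0)

# Cyclic group C_4 of planar rotations, and a rotation-invariant target.
group = [np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
         for t in (0, np.pi / 2, np.pi, 3 * np.pi / 2)]
X = rng.uniform(-1, 1, (40, 2))
y = np.sin(3 * (X ** 2).sum(1))        # depends only on |x|, so C4-invariant

lam = 1e-3
alpha = np.linalg.solve(invariant_k(X, X, group) + lam * np.eye(len(X)), y)
Xt = rng.uniform(-1, 1, (5, 2))
pred = invariant_k(Xt, X, group) @ alpha   # KRR prediction with k_G
```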
arXiv Detail & Related papers (2023-03-24T20:47:31Z)
- Mean-Square Analysis of Discretized Itô Diffusions for Heavy-tailed Sampling [17.415391025051434]
We analyze the complexity of sampling from a class of heavy-tailed distributions by discretizing a natural class of Itô diffusions associated with weighted Poincaré inequalities.
Based on a mean-square analysis, we establish the iteration complexity for obtaining a sample whose distribution is $\epsilon$-close to the target distribution in the Wasserstein-2 metric.
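As a toy instance of this idea (a hypothetical 1-D member of such a diffusion class, with the drift derived from the stationarity condition $b = (\sigma^2 p)' / (2p)$; not the paper's scheme or its step-size theory), Euler-Maruyama on a diffusion whose stationary law is Student-t:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# For sigma^2(x) = 1 + x^2/nu and p(x) ∝ (1 + x^2/nu)^{-(nu+1)/2},
# b = (sigma^2 p)'/(2p) gives the linear drift -((nu-1)/(2 nu)) x, so
#   dX = -((nu-1)/(2 nu)) X dt + sqrt(1 + X^2/nu) dW
# has Student-t(nu) as its stationary law; multiplicative noise carries
# the heavy tails.
nu, h, T = 3.0, 0.01, 200_000
x = 0.0
out = np.empty(T)
for t in range(T):
    drift = -((nu - 1) / (2 * nu)) * x
    x += h * drift + np.sqrt(h * (1 + x * x / nu)) * rng.standard_normal()
    out[t] = x

# Tail check: empirical quantiles vs the exact Student-t quantiles.
print(np.quantile(out[T // 5:], [0.05, 0.5, 0.95]))
print(stats.t(df=nu).ppf([0.05, 0.5, 0.95]))
```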
arXiv Detail & Related papers (2023-03-01T15:16:03Z)
- Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
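For context (a naive baseline only, precisely the kind of sampling estimator the paper improves on; the toy "flow" T below is an assumption, not one of the paper's architectures): the plain Monte Carlo estimate of $P(X \in R)$ for $X = T(Z)$:

```python
import numpy as np

rng = np.random.default_rng(4)

# Baseline Monte Carlo CDF estimate for a flow X = T(Z), Z ~ N(0, I):
# P(X in R) = E_Z[ 1{T(Z) in R} ].
def T(z):
    """Toy invertible map in the spirit of a coupling-layer flow."""
    x0 = z[:, 0]
    x1 = z[:, 1] * np.exp(0.5 * np.tanh(x0)) + x0
    return np.stack([x0, x1], axis=1)

lo, hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])  # closed box R
z = rng.standard_normal((100_000, 2))
x = T(z)
print(((x >= lo) & (x <= hi)).all(axis=1).mean())      # ~ P(X in R)
```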
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
- Forster Decomposition and Learning Halfspaces with Noise [60.691817861402676]
A Forster transform is an operation that turns a distribution into one with good anti-concentration properties.
We show that any distribution can be efficiently decomposed as a disjoint mixture of a few distributions for which a Forster transform exists and can be computed efficiently.
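A minimal sketch of computing a Forster transform by fixed-point iteration (illustrative only; this is not the paper's decomposition algorithm, and the iteration can fail when no transform exists):

```python
import numpy as np

rng = np.random.default_rng(5)

def forster(X, iters=200):
    """Find A so the normalized points A x / ||A x|| are close to
    isotropic position, i.e. their second-moment matrix is I/d."""
    n, d = X.shape
    A = np.eye(d)
    for _ in range(iters):
        Y = X @ A.T
        Y /= np.linalg.norm(Y, axis=1, keepdims=True)
        M = (Y.T @ Y) / n                    # current second-moment matrix
        w, V = np.linalg.eigh(M)             # push M toward I/d
        A = V @ np.diag(1.0 / np.sqrt(d * w)) @ V.T @ A
    return A

X = rng.standard_normal((500, 3)) * np.array([5.0, 1.0, 0.2])  # anisotropic
A = forster(X)
Y = X @ A.T
Y /= np.linalg.norm(Y, axis=1, keepdims=True)
print((Y.T @ Y) / len(Y))                    # ~ I/3: good anti-concentration
```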
arXiv Detail & Related papers (2021-07-12T17:00:59Z)
- Learning Equivariant Energy Based Models with Equivariant Stein Variational Gradient Descent [80.73580820014242]
We focus on the problem of efficient sampling and learning of probability densities by incorporating symmetries in probabilistic models.
We first introduce the Equivariant Stein Variational Gradient Descent algorithm -- an equivariant sampling method based on Stein's identity for sampling from densities with symmetries.
We propose new ways of improving and scaling up training of energy based models.
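A minimal sketch of the plain (non-equivariant) SVGD update that the paper builds on; the kernel, bandwidth, step size, and target below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(6)

def svgd_step(X, grad_logp, bw=0.5, eps=0.1):
    """One plain SVGD update with a Gaussian kernel:
    x_i <- x_i + eps * (1/n) sum_j [ k(x_j, x_i) grad log p(x_j)
                                     + grad_{x_j} k(x_j, x_i) ]."""
    diff = X[:, None, :] - X[None, :, :]               # diff[j, i] = x_j - x_i
    K = np.exp(-(diff ** 2).sum(-1) / (2 * bw ** 2))   # k(x_j, x_i)
    gK = -diff / bw ** 2 * K[..., None]                # grad_{x_j} k(x_j, x_i)
    phi = (K[..., None] * grad_logp(X)[:, None, :] + gK).mean(axis=0)
    return X + eps * phi

X = rng.uniform(-3, 3, (100, 2))
for _ in range(500):
    X = svgd_step(X, lambda x: -x)      # target N(0, I): grad log p(x) = -x
print(X.mean(axis=0), X.var(axis=0))    # roughly 0 and (for this fixed bw) ~1
```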
arXiv Detail & Related papers (2021-06-15T01:35:17Z)
- On the Sample Complexity of Learning with Geometric Stability [42.813141600050166]
We study the sample complexity of learning problems where the target function presents such invariance and stability properties.
We provide non-parametric rates of convergence for kernel methods, and improvements in sample complexity by a factor equal to the size of the group.
arXiv Detail & Related papers (2021-06-14T03:51:16Z)
- GANs with Variational Entropy Regularizers: Applications in Mitigating the Mode-Collapse Issue [95.23775347605923]
Building on the success of deep learning, Generative Adversarial Networks (GANs) provide a modern approach to learn a probability distribution from observed samples.
GANs often suffer from mode collapse, where the generator fails to capture all modes of the input distribution.
We take an information-theoretic approach and maximize a variational lower bound on the entropy of the generated samples to increase their diversity.
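A sketch of the general idea of an entropy bonus on generated samples; the paper uses a variational lower bound, whereas this stand-in uses a crude nearest-neighbor entropy proxy (a plainly swapped-in technique, for illustration only):

```python
import numpy as np

def nn_entropy_proxy(x):
    """Crude nearest-neighbor entropy proxy: the mean log distance to the
    nearest other sample grows with sample diversity (Kozachenko-Leonenko
    flavor, constants dropped).  Not the paper's variational bound."""
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    return 0.5 * np.log(d2.min(axis=1)).mean()

# Hypothetical use inside a generator update:
#   loss_G = adversarial_loss - lam * nn_entropy_proxy(fake_batch)
```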
arXiv Detail & Related papers (2020-09-24T19:34:37Z)
- Interpolation and Learning with Scale Dependent Kernels [91.41836461193488]
We study the learning properties of nonparametric ridgeless least squares.
We consider the common case of estimators defined by scale-dependent kernels.
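A minimal sketch of ridgeless interpolation with a scale-dependent Gaussian kernel whose bandwidth shrinks with $n$; the scaling exponent and the data below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)

def fit_predict(X, y, Xt, bw):
    """Ridgeless (lambda = 0) kernel interpolation; a tiny jitter is added
    purely for numerical stability of the solve."""
    K = np.exp(-((X[:, None] - X[None, :]) ** 2) / (2 * bw ** 2))
    Kt = np.exp(-((Xt[:, None] - X[None, :]) ** 2) / (2 * bw ** 2))
    return Kt @ np.linalg.solve(K + 1e-10 * np.eye(len(X)), y)

n = 200
X = np.sort(rng.uniform(0, 1, n))
y = np.sin(2 * np.pi * X) + 0.1 * rng.standard_normal(n)
Xt = np.linspace(0, 1, 500)
pred = fit_predict(X, y, Xt, bw=n ** -0.5)   # bandwidth scaling with n
```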
arXiv Detail & Related papers (2020-06-17T16:43:37Z)
- Distributed, partially collapsed MCMC for Bayesian Nonparametrics [68.5279360794418]
We exploit the fact that completely random measures, the class in which commonly used models such as the Dirichlet process and the beta-Bernoulli process can be expressed, decompose into independent sub-measures.
We use this decomposition to partition the latent measure into a finite measure containing only instantiated components, and an infinite measure containing all other components.
The resulting hybrid algorithm can be applied to allow scalable inference without sacrificing convergence guarantees.
arXiv Detail & Related papers (2020-01-15T23:10:13Z)