On quantitative Laplace-type convergence results for some exponential
probability measures, with two applications
- URL: http://arxiv.org/abs/2110.12922v1
- Date: Mon, 25 Oct 2021 13:00:25 GMT
- Title: On quantitative Laplace-type convergence results for some exponential
probability measures, with two applications
- Authors: Valentin De Bortoli, Agnès Desolneux
- Abstract summary: We establish quantitative convergence bounds for the sequence of measures $(\pi_\varepsilon)_{\varepsilon > 0}$ with density w.r.t. the Lebesgue measure proportional to $\exp[-U(x)/\varepsilon]$, as the temperature $\varepsilon$ converges to $0$.
- Score: 2.9189409618561966
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Laplace-type results characterize the limit of a sequence of measures
$(\pi_\varepsilon)_{\varepsilon >0}$ with density w.r.t. the Lebesgue measure
$(\mathrm{d} \pi_\varepsilon / \mathrm{d} \mathrm{Leb})(x) \propto
\exp[-U(x)/\varepsilon]$ when the temperature $\varepsilon>0$ converges to $0$.
If a limiting distribution $\pi_0$ exists, it concentrates on the minimizers of
the potential $U$. Classical results require the invertibility of the Hessian
of $U$ in order to establish such asymptotics. In this work, we study the
particular case of norm-like potentials $U$ and establish quantitative bounds
between $\pi_\varepsilon$ and $\pi_0$ w.r.t. the Wasserstein distance of order
$1$ under an invertibility condition of a generalized Jacobian. One key element
of our proof is the use of geometric measure theory tools such as the coarea
formula. We apply our results to the study of maximum entropy models
(microcanonical/macrocanonical distributions) and to the convergence of the
iterates of the Stochastic Gradient Langevin Dynamics (SGLD) algorithm at low
temperatures for non-convex minimization.
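The low-temperature concentration described in the abstract can be visualized with a short simulation. Below is a minimal sketch (the potential, step size, and chain length are illustrative assumptions, not the paper's construction) of the unadjusted Langevin/SGLD update $x_{k+1} = x_k - \gamma \nabla U(x_k) + \sqrt{2\gamma\varepsilon}\,\xi_k$ for the non-convex potential $U(x) = (\|x\|^2 - 1)^2$, whose minimizers form the unit circle and whose Hessian is singular along it, so the classical Hessian-invertibility assumption fails:

```python
import numpy as np

def grad_U(x):
    # U(x) = (|x|^2 - 1)^2, hence grad U(x) = 4 (|x|^2 - 1) x
    return 4.0 * (np.sum(x * x) - 1.0) * x

def sgld_sample(eps, gamma=1e-3, n_steps=5_000, seed=0):
    # Unadjusted Langevin / SGLD chain targeting pi_eps ∝ exp(-U(x)/eps)
    rng = np.random.default_rng(seed)
    x = rng.normal(size=2)
    for _ in range(n_steps):
        x = x - gamma * grad_U(x) + np.sqrt(2.0 * gamma * eps) * rng.normal(size=2)
    return x

# As eps -> 0, samples concentrate on the minimizer set {|x| = 1}:
for eps in (1.0, 0.1, 0.01):
    radii = [np.linalg.norm(sgld_sample(eps, seed=s)) for s in range(50)]
    print(f"eps={eps:4.2f}: mean |x| = {np.mean(radii):.3f}, std = {np.std(radii):.3f}")
```

The shrinking spread of the sampled radii around $1$ as eps decreases mirrors the Wasserstein-1 bounds between $\pi_\varepsilon$ and $\pi_0$ studied in the paper.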
Related papers
- Estimation and Inference in Distributional Reinforcement Learning [28.253677740976197]
We show that a dataset of size $\widetilde{O}\left(\frac{|\mathcal{S}||\mathcal{A}|}{\epsilon^{2}(1-\gamma)^{4}}\right)$ suffices to ensure that the Kolmogorov metric and total variation metric between $\hat{\eta}^{\pi}$ and $\eta^{\pi}$ are below $\epsilon$ with high probability.
Our findings give rise to a unified approach to statistical inference of a wide class of statistical functionals of $etapi$.
arXiv Detail & Related papers (2023-09-29T14:14:53Z)
- A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^{*}$ rather than for all $\mathbf{x}^{*}$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z)
- Efficient Sampling of Stochastic Differential Equations with Positive
Semi-Definite Models [91.22420505636006]
This paper deals with the problem of efficient sampling from a differential equation, given the drift function and the diffusion matrix.
It is possible to obtain independent and identically distributed (i.i.d.) samples at precision $\varepsilon$ with a cost that is $m^{2} d \log(1/\varepsilon)$.
Our results suggest that as the true solution gets smoother, we can circumvent the curse of dimensionality without requiring any sort of convexity.
arXiv Detail & Related papers (2023-03-30T02:50:49Z)
- A note on $L^1$-Convergence of the Empiric Minimizer for unbounded
functions with fast growth [0.0]
For $V : \mathbb{R}^{d} \to \mathbb{R}$ coercive, we study the convergence rate for the $L^{1}$-distance of the empiric minimizer.
We show that in general, for unbounded functions with fast growth, the convergence rate is bounded above by $a_n n^{-1/q}$, where $q$ is the dimension of the latent random variable. (A numerical sketch follows this entry.)
arXiv Detail & Related papers (2023-03-08T08:46:13Z)
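The sketch promised in the entry above: a minimal illustration of the empiric minimizer $\min_{i \le n} V(X_i)$ for an assumed coercive $V$ with fast growth and a one-dimensional latent variable (an illustrative setting, not the paper's):

```python
import numpy as np

def V(x):
    # Coercive potential with fast growth; min V = 0, attained at x = 0
    return np.abs(x) ** 4

rng = np.random.default_rng(0)
for n in (100, 1_000, 10_000):
    # Monte Carlo estimate of the L^1 distance E[min_i V(X_i) - min V],
    # with latent samples X_i ~ N(0, 1) (so q = 1 in the bound above).
    errs = [np.min(V(rng.normal(size=n))) for _ in range(200)]
    print(f"n={n:6d}: E[min_i V(X_i)] ≈ {np.mean(errs):.2e}")
```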
- Sparse Signal Detection in Heteroscedastic Gaussian Sequence Models:
Sharp Minimax Rates [1.0309387309011746]
We study the signal detection problem against sparse alternatives, for known sparsity $s$.
We find minimax upper and lower bounds over the minimax separation radius $\epsilon^{*}$ and prove that they are always matching.
Our results reveal new phase transitions regarding the behavior of $\epsilon^{*}$ with respect to the level of sparsity, to the $L^{t}$ metric, and to the heteroscedasticity profile of $\Sigma$.
arXiv Detail & Related papers (2022-11-15T23:53:39Z)
- A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound $\Omega(\sqrt{n/p})$ for the robustness of interpolating neural network parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n = \mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n = \exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z)
- Simulated annealing from continuum to discretization: a convergence
analysis via the Eyring--Kramers law [10.406659081400354]
We study the convergence rate of continuous-time simulated annealing $(X_t,\, t \ge 0)$ and its discretization $(x_k,\, k = 0, 1, \ldots)$.
We prove that the tail probability $\mathbb{P}(f(X_t) > \min f + \delta)$ (resp. $\mathbb{P}(f(x_k) > \min f + \delta)$) decays in time (resp. in cumulative step size). (A discretization sketch follows this entry.)
arXiv Detail & Related papers (2021-02-03T23:45:39Z)
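The sketch for the simulated-annealing entry above: a minimal version of the discretized dynamics $(x_k)$ with a logarithmic cooling schedule (the potential, schedule constant, and step size are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def grad_f(x):
    # Double-well f(x) = (x^2 - 1)^2 + 0.3 x: global minimizer near x = -1.04,
    # non-global local minimizer near x = +0.96
    return 4.0 * x * (x * x - 1.0) + 0.3

rng = np.random.default_rng(1)
x, gamma = 2.0, 1e-2  # start in the basin of the *non-global* minimum
for k in range(200_000):
    eps_k = 2.0 / np.log(k + 2.0)  # logarithmic cooling schedule
    x = x - gamma * grad_f(x) + np.sqrt(2.0 * gamma * eps_k) * rng.normal()

# Annealing typically ends near the global minimizer despite the bad start:
print(f"final iterate x = {x:.3f} (global minimizer ~ -1.04, local ~ +0.96)")
```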
- Improved Sample Complexity for Incremental Autonomous Exploration in
MDPs [132.88757893161699]
We learn the set of $\epsilon$-optimal goal-conditioned policies attaining all states that are incrementally reachable within $L$ steps.
DisCo is the first algorithm that can return an $\epsilon/c_{\min}$-optimal policy for any cost-sensitive shortest-path problem.
arXiv Detail & Related papers (2020-12-29T14:06:09Z)
- Optimal Mean Estimation without a Variance [103.26777953032537]
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist.
We design an estimator which attains the smallest possible confidence interval as a function of $n, d, \delta$. (A classical baseline is sketched after this entry.)
arXiv Detail & Related papers (2020-11-24T22:39:21Z)
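For context on the mean-estimation entry above: a classical baseline for heavy-tailed mean estimation is the median-of-means estimator, sketched below in one dimension (the paper's estimator is a sharper construction; this baseline and its parameters are illustrative assumptions):

```python
import numpy as np

def median_of_means(xs, n_blocks):
    # Average within blocks, then take the median across block means:
    # robust to heavy tails even when the variance is infinite.
    blocks = np.array_split(xs, n_blocks)
    return np.median([np.mean(b) for b in blocks])

# Pareto data with tail index 1.5: finite mean (= 3), infinite variance.
rng = np.random.default_rng(0)
xs = rng.pareto(1.5, size=100_000) + 1.0
print("empirical mean :", np.mean(xs))
print("median of means:", median_of_means(xs, n_blocks=30))
```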
- Convergence of Langevin Monte Carlo in Chi-Squared and Renyi Divergence [8.873449722727026]
For convex and first-order smooth potentials, we show that the LMC algorithm achieves the rate estimate $\widetilde{\mathcal{O}}(d\epsilon^{-1})$, which improves the previously known rates in both of these metrics.
arXiv Detail & Related papers (2020-07-22T18:18:28Z)
- Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2}) + \epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.