Improved Particle Approximation Error for Mean Field Neural Networks
- URL: http://arxiv.org/abs/2405.15767v3
- Date: Wed, 30 Oct 2024 14:24:34 GMT
- Title: Improved Particle Approximation Error for Mean Field Neural Networks
- Authors: Atsushi Nitanda
- Abstract summary: Mean-field Langevin dynamics (MFLD) minimizes an entropy-regularized nonlinear convex functional defined over the space of probability distributions.
Recent works have demonstrated the uniform-in-time propagation of chaos for MFLD.
We improve the dependence on logarithmic Sobolev inequality (LSI) constants in their particle approximation errors.
- Score: 9.817855108627452
- License:
- Abstract: Mean-field Langevin dynamics (MFLD) minimizes an entropy-regularized nonlinear convex functional defined over the space of probability distributions. MFLD has gained attention due to its connection with noisy gradient descent for mean-field two-layer neural networks. Unlike standard Langevin dynamics, the nonlinearity of the objective functional induces particle interactions, necessitating multiple particles to approximate the dynamics in a finite-particle setting. Recent works (Chen et al., 2022; Suzuki et al., 2023b) have demonstrated the uniform-in-time propagation of chaos for MFLD, showing that the gap between the particle system and its mean-field limit uniformly shrinks over time as the number of particles increases. In this work, we improve the dependence on logarithmic Sobolev inequality (LSI) constants in their particle approximation errors, which can exponentially deteriorate with the regularization coefficient. Specifically, we establish an LSI-constant-free particle approximation error concerning the objective gap by leveraging the problem structure in risk minimization. As the application, we demonstrate improved convergence of MFLD, sampling guarantee for the mean-field stationary distribution, and uniform-in-time Wasserstein propagation of chaos in terms of particle complexity.
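As a rough illustration of the finite-particle setting described in the abstract, the sketch below runs noisy gradient descent on the particles (neurons) of a mean-field two-layer network, i.e. an Euler-Maruyama discretization of the particle MFLD. All specifics (the toy data, width, step size, and regularization coefficient) are assumptions for illustration, not taken from the paper:

```python
import numpy as np

# Minimal sketch of finite-particle MFLD: noisy gradient descent on the
# neurons of a mean-field two-layer network
#   f(x) = (1/N) * sum_i a_i * tanh(w_i . x).
# Toy data, width, step size, and regularization are all assumptions.

rng = np.random.default_rng(0)

N, d = 200, 2           # number of particles (neurons), input dimension
eta, lam = 0.05, 0.01   # step size, entropic-regularization coefficient

# Toy regression data (assumed for illustration).
X = rng.normal(size=(50, d))
y = np.sin(X[:, 0])
M = len(y)

a = rng.normal(size=N)        # output weights, one per particle
w = rng.normal(size=(N, d))   # input weights, one row per particle

def loss(a, w):
    return 0.5 * np.mean((np.tanh(X @ w.T) @ a / N - y) ** 2)

loss0 = loss(a, w)
for _ in range(500):
    h = np.tanh(X @ w.T)        # (M, N) hidden activations
    resid = h @ a / N - y       # (M,) residuals of the mean-field output
    # Per-particle gradients of the first-variation of the risk
    # (note: no 1/N factor, matching the mean-field scaling).
    grad_a = h.T @ resid / M
    grad_w = ((1 - h ** 2) * resid[:, None] * a[None, :]).T @ X / M
    # Noisy gradient descent = one Euler-Maruyama step of particle MFLD.
    a = a - eta * grad_a + np.sqrt(2 * lam * eta) * rng.normal(size=N)
    w = w - eta * grad_w + np.sqrt(2 * lam * eta) * rng.normal(size=(N, d))

final_loss = loss(a, w)
```

The injected Gaussian noise is what makes this Langevin rather than plain gradient descent; the paper's question is how closely this N-particle system tracks its mean-field (N to infinity) limit.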
Related papers
- Kinetic Interacting Particle Langevin Monte Carlo [0.0]
This paper introduces and analyses interacting underdamped Langevin algorithms for statistical inference in latent variable models.
We propose a diffusion process that evolves jointly in the space of parameters and latent variables.
We provide two explicit discretisations of this diffusion as practical algorithms to estimate parameters of statistical models.
arXiv Detail & Related papers (2024-07-08T09:52:46Z)
- Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
arXiv Detail & Related papers (2023-12-02T13:01:29Z)
- Reducing defect production in random transverse-field Ising chains by inhomogeneous driving fields [0.0]
In transverse-field Ising models, disorder in the couplings gives rise to a drastic reduction of the critical energy gap.
We show that the scaling of defect density with annealing time can be made algebraic by balancing the coupling disorder with suitably chosen driving fields.
We also study defect production during an environment-temperature quench of the open variant of the model in which the system is slowly cooled down to its quantum critical point.
arXiv Detail & Related papers (2023-09-22T12:28:22Z)
- Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction [49.66486092259376]
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift.
Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures.
We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
arXiv Detail & Related papers (2023-06-12T16:28:11Z)
- Interacting Particle Langevin Algorithm for Maximum Marginal Likelihood Estimation [2.53740603524637]
We develop a class of interacting particle systems for implementing a maximum marginal likelihood estimation procedure.
In particular, we prove that the parameter marginal of the stationary measure of this diffusion has the form of a Gibbs measure.
Using a particular rescaling, we then prove geometric ergodicity of this system and bound the discretisation error in a manner that is uniform in time and does not increase with the number of particles.
arXiv Detail & Related papers (2023-03-23T16:50:08Z)
- Learning Discretized Neural Networks under Ricci Flow [51.36292559262042]
We study Discretized Neural Networks (DNNs) composed of low-precision weights and activations.
DNNs suffer from either infinite or zero gradients due to the non-differentiable discrete function during training.
arXiv Detail & Related papers (2023-02-07T10:51:53Z)
- Locality of Spontaneous Symmetry Breaking and Universal Spacing Distribution of Topological Defects Formed Across a Phase Transition [62.997667081978825]
A continuous phase transition results in the formation of topological defects with a density predicted by the Kibble-Zurek mechanism (KZM).
We characterize the spatial distribution of point-like topological defects in the resulting nonequilibrium state and model it using a Poisson point process in arbitrary spatial dimension with KZM density.
arXiv Detail & Related papers (2022-02-23T19:00:06Z)
- Quantum correlations, entanglement spectrum and coherence of two-particle reduced density matrix in the Extended Hubbard Model [62.997667081978825]
We study the ground state properties of the one-dimensional extended Hubbard model at half-filling.
In particular, in the superconducting region, we obtain that the entanglement spectrum signals a transition from a dominant singlet (SS) to a triplet (TS) pairing ordering in the system.
arXiv Detail & Related papers (2021-10-29T21:02:24Z)
- Statistical mechanics of one-dimensional quantum droplets [0.0]
We study the dynamical relaxation process of modulationally unstable one-dimensional quantum droplets.
We find that the instability leads to the spontaneous formation of quantum droplets featuring multiple collisions.
arXiv Detail & Related papers (2021-02-25T15:30:30Z)
- Hessian-Free High-Resolution Nesterov Acceleration for Sampling [55.498092486970364]
Nesterov's Accelerated Gradient (NAG) for optimization has better performance than its continuous time limit (noiseless kinetic Langevin) when a finite step-size is employed.
This work explores the sampling counterpart of this phenomenon and proposes a diffusion process whose discretizations can yield accelerated gradient-based MCMC methods.
arXiv Detail & Related papers (2020-06-16T15:07:37Z)
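The kinetic (underdamped) Langevin dynamics appearing in the first and last entries above can be sketched in a few lines. The 1-D Gaussian target and all parameters here are assumptions for illustration, not taken from any of the listed papers:

```python
import numpy as np

# Minimal sketch of kinetic (underdamped) Langevin dynamics,
#   dx = v dt,   dv = (-gamma * v - grad_U(x)) dt + sqrt(2 * gamma) dW,
# discretized with Euler-Maruyama and used to sample a 1-D standard
# Gaussian, U(x) = x^2 / 2.  Target and parameters are assumed.

rng = np.random.default_rng(1)

gamma, h, n_steps = 2.0, 0.01, 200_000  # friction, step size, iterations

def grad_U(x):
    return x  # gradient of U(x) = x^2 / 2

x, v = 0.0, 0.0
samples = np.empty(n_steps)
for k in range(n_steps):
    v += h * (-gamma * v - grad_U(x)) + np.sqrt(2 * gamma * h) * rng.normal()
    x += h * v
    samples[k] = x

# Discard the first half as burn-in; the rest approximate N(0, 1).
mean_est = samples[n_steps // 2:].mean()
var_est = samples[n_steps // 2:].var()
```

The velocity variable gives the sampler momentum, which is the mechanism behind the acceleration studied in the Nesterov entry above; the discretization bias here is O(h), so more careful splitting schemes are used in practice.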
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.