Improved Particle Approximation Error for Mean Field Neural Networks
- URL: http://arxiv.org/abs/2405.15767v3
- Date: Wed, 30 Oct 2024 14:24:34 GMT
- Title: Improved Particle Approximation Error for Mean Field Neural Networks
- Authors: Atsushi Nitanda
- Abstract summary: Mean-field Langevin dynamics (MFLD) minimizes an entropy-regularized nonlinear convex functional defined over the space of probability distributions.
Recent works have demonstrated the uniform-in-time propagation of chaos for MFLD.
We improve the dependence on logarithmic Sobolev inequality (LSI) constants in their particle approximation errors.
- Abstract: Mean-field Langevin dynamics (MFLD) minimizes an entropy-regularized nonlinear convex functional defined over the space of probability distributions. MFLD has gained attention due to its connection with noisy gradient descent for mean-field two-layer neural networks. Unlike standard Langevin dynamics, the nonlinearity of the objective functional induces particle interactions, necessitating multiple particles to approximate the dynamics in a finite-particle setting. Recent works (Chen et al., 2022; Suzuki et al., 2023b) have demonstrated the uniform-in-time propagation of chaos for MFLD, showing that the gap between the particle system and its mean-field limit uniformly shrinks over time as the number of particles increases. In this work, we improve the dependence on logarithmic Sobolev inequality (LSI) constants in their particle approximation errors, which can exponentially deteriorate with the regularization coefficient. Specifically, we establish an LSI-constant-free particle approximation error concerning the objective gap by leveraging the problem structure in risk minimization. As applications, we demonstrate improved convergence of MFLD, a sampling guarantee for the mean-field stationary distribution, and uniform-in-time Wasserstein propagation of chaos in terms of particle complexity.
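The finite-particle dynamics referred to in the abstract is simply noisy gradient descent run on the neuron parameters of a two-layer network. Below is a minimal sketch of one such simulation; the tanh feature map, squared loss, and all hyperparameters are illustrative assumptions, not the paper's exact setting.

```python
import numpy as np

rng = np.random.default_rng(0)

N, d, n = 200, 5, 100                       # particles (neurons), input dim, data size
eta, lam, weight_decay = 0.05, 1e-3, 1e-2   # step size, entropic reg. lambda, L2 reg.

Z = rng.standard_normal((n, d))             # toy inputs
y = np.sign(Z[:, 0])                        # toy labels

X = rng.standard_normal((N, d))             # particles = neuron parameters

for step in range(1000):
    H = np.tanh(Z @ X.T)                    # h(x_i, z) for all data/particles, (n, N)
    pred = H.mean(axis=1)                   # mean-field two-layer output, (n,)
    resid = pred - y                        # squared-loss residual
    dH = 1.0 - H**2                         # tanh'(<x_i, z>)
    # first-variation gradient of the regularized risk at each particle, (N, d)
    grad = (resid[:, None] * dH).T @ Z / n + weight_decay * X
    # one Euler-Maruyama step of the N-particle MFLD = noisy gradient descent
    X -= eta * grad
    X += np.sqrt(2.0 * lam * eta) * rng.standard_normal((N, d))
```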
Related papers
- Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble [36.19164064733151]
Mean-field Langevin dynamics (MFLD) is an optimization method derived by taking the mean-field limit of noisy gradient descent for two-layer neural networks.
Recent work shows that the approximation error due to finite particles remains uniform in time and diminishes as the number of particles increases.
In this paper, we establish an improved propagation of chaos (PoC) result for MFLD, which removes the exponential dependence on the regularization coefficient from the particle approximation term.
arXiv Detail & Related papers (2025-02-09T05:58:46Z)
- Uniform-in-time weak propagation of chaos for consensus-based optimization [4.533408985664949]
We study the uniform-in-time weak propagation of chaos for the consensus-based optimization (CBO) method on a bounded searching domain.
Our work shows that the weak error has order $O(N^{-1})$ uniformly in time, where $N$ denotes the number of particles.
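For context, one CBO iteration drifts every particle toward a Gibbs-weighted consensus point and adds noise scaled by the distance to it. A minimal sketch follows, with an illustrative objective and coefficients; projection onto the bounded search domain is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # toy non-convex objective (Rastrigin-like), minimized near the origin
    return np.sum(x**2 - np.cos(2 * np.pi * x), axis=-1)

N, d = 100, 2
lam, sigma, alpha, dt = 1.0, 0.7, 50.0, 0.05

X = rng.uniform(-3, 3, size=(N, d))           # particles in the search domain

for step in range(500):
    # Gibbs weights concentrate on particles with low objective values
    w = np.exp(-alpha * (f(X) - f(X).min()))  # shift for numerical stability
    x_bar = (w[:, None] * X).sum(axis=0) / w.sum()
    # drift toward the consensus point plus distance-scaled isotropic noise
    drift = -lam * (X - x_bar) * dt
    noise = (sigma * np.linalg.norm(X - x_bar, axis=1, keepdims=True)
             * np.sqrt(dt) * rng.standard_normal((N, d)))
    X = X + drift + noise
```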
arXiv Detail & Related papers (2025-02-01T22:38:10Z)
- On the Convergence of Min-Max Langevin Dynamics and Algorithm [15.132772939268989]
We study zero-sum games in the space of probability distributions over the Euclidean space $\mathbb{R}^d$ with entropy regularization.
We prove an exponential convergence guarantee for the mean-field min-max Langevin dynamics to compute the equilibrium distribution.
arXiv Detail & Related papers (2024-12-29T14:20:23Z)
- Entanglement Transition due to particle losses in a monitored fermionic chain [0.0]
We study the dynamics of the entanglement entropy scaling under quantum jumps.
We show that by tuning the system parameters, a measurement-induced transition occurs where the entanglement entropy changes from logarithmic to area-law.
arXiv Detail & Related papers (2024-08-07T11:30:09Z)
- Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
arXiv Detail & Related papers (2023-12-02T13:01:29Z)
- Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction [49.66486092259376]
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift.
Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures.
We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
arXiv Detail & Related papers (2023-06-12T16:28:11Z)
- Locality of Spontaneous Symmetry Breaking and Universal Spacing Distribution of Topological Defects Formed Across a Phase Transition [62.997667081978825]
A continuous phase transition results in the formation of topological defects with a density predicted by the Kibble-Zurek mechanism (KZM).
We characterize the spatial distribution of point-like topological defects in the resulting nonequilibrium state and model it using a Poisson point process in arbitrary spatial dimension with KZM density.
arXiv Detail & Related papers (2022-02-23T19:00:06Z)
- Convex Analysis of the Mean Field Langevin Dynamics [49.66486092259375]
A convergence rate analysis of the mean field Langevin dynamics is presented.
The proximal Gibbs distribution $p_q$ associated with the dynamics allows us to develop a convergence theory parallel to classical results in convex optimization.
arXiv Detail & Related papers (2022-01-25T17:13:56Z)
- Quantum correlations, entanglement spectrum and coherence of two-particle reduced density matrix in the Extended Hubbard Model [62.997667081978825]
We study the ground state properties of the one-dimensional extended Hubbard model at half-filling.
In particular, in the superconducting region, we find that the entanglement spectrum signals a transition from a dominant singlet (SS) to a triplet (TS) pairing ordering in the system.
arXiv Detail & Related papers (2021-10-29T21:02:24Z)
- Hessian-Free High-Resolution Nesterov Acceleration for Sampling [55.498092486970364]
Nesterov's Accelerated Gradient (NAG) for optimization has better performance than its continuous-time limit (noiseless kinetic Langevin) when a finite step-size is employed.
This work explores the sampling counterpart of this phenomenon and proposes a diffusion process whose discretizations can yield accelerated gradient-based MCMC methods.
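For reference, the continuous-time limit mentioned here is kinetic (underdamped) Langevin dynamics; a minimal Euler-type discretization for a standard Gaussian target is sketched below. The paper's accelerated diffusion and its discretization differ; this is only a sketch of the baseline object being accelerated, with an illustrative potential and parameters.

```python
import numpy as np

rng = np.random.default_rng(2)

def grad_U(x):
    # potential U(x) = |x|^2 / 2, so the target density is a standard Gaussian
    return x

d, gamma, h, n_steps = 2, 2.0, 0.01, 5000   # dim, friction, step size, steps
x = np.zeros(d)                             # position
v = np.zeros(d)                             # velocity (momentum)

for _ in range(n_steps):
    # velocity update: friction + potential gradient + injected noise
    v += -h * (gamma * v + grad_U(x)) \
         + np.sqrt(2.0 * gamma * h) * rng.standard_normal(d)
    # position update driven by the velocity
    x += h * v
```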
arXiv Detail & Related papers (2020-06-16T15:07:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.