Improved Particle Approximation Error for Mean Field Neural Networks
- URL: http://arxiv.org/abs/2405.15767v3
- Date: Wed, 30 Oct 2024 14:24:34 GMT
- Title: Improved Particle Approximation Error for Mean Field Neural Networks
- Authors: Atsushi Nitanda
- Abstract summary: Mean-field Langevin dynamics (MFLD) minimizes an entropy-regularized nonlinear convex functional defined over the space of probability distributions.
Recent works have demonstrated the uniform-in-time propagation of chaos for MFLD.
We improve the dependence on logarithmic Sobolev inequality (LSI) constants in their particle approximation errors.
- Abstract: Mean-field Langevin dynamics (MFLD) minimizes an entropy-regularized nonlinear convex functional defined over the space of probability distributions. MFLD has gained attention due to its connection with noisy gradient descent for mean-field two-layer neural networks. Unlike standard Langevin dynamics, the nonlinearity of the objective functional induces particle interactions, necessitating multiple particles to approximate the dynamics in a finite-particle setting. Recent works (Chen et al., 2022; Suzuki et al., 2023b) have demonstrated the uniform-in-time propagation of chaos for MFLD, showing that the gap between the particle system and its mean-field limit uniformly shrinks over time as the number of particles increases. In this work, we improve the dependence on logarithmic Sobolev inequality (LSI) constants in their particle approximation errors, which can exponentially deteriorate with the regularization coefficient. Specifically, we establish an LSI-constant-free particle approximation error concerning the objective gap by leveraging the problem structure in risk minimization. As applications, we demonstrate improved convergence of MFLD, a sampling guarantee for the mean-field stationary distribution, and uniform-in-time Wasserstein propagation of chaos in terms of particle complexity.
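The finite-particle dynamics referred to in the abstract is simply noisy gradient descent run on the neuron parameters of a two-layer network. Below is a minimal sketch of one such simulation; the tanh feature map, squared loss, and all hyperparameters are illustrative assumptions, not the paper's exact setting.

```python
import numpy as np

rng = np.random.default_rng(0)

N, d, n = 200, 5, 100                       # particles (neurons), input dim, data size
eta, lam, weight_decay = 0.05, 1e-3, 1e-2   # step size, entropic reg. lambda, L2 reg.

Z = rng.standard_normal((n, d))             # toy inputs
y = np.sign(Z[:, 0])                        # toy labels

X = rng.standard_normal((N, d))             # particles = neuron parameters

for step in range(1000):
    H = np.tanh(Z @ X.T)                    # h(x_i, z) for all data/particles, (n, N)
    pred = H.mean(axis=1)                   # mean-field two-layer output, (n,)
    resid = pred - y                        # squared-loss residual
    dH = 1.0 - H**2                         # tanh'(<x_i, z>)
    # first-variation gradient of the regularized risk at each particle, (N, d)
    grad = (resid[:, None] * dH).T @ Z / n + weight_decay * X
    # one Euler-Maruyama step of the N-particle MFLD = noisy gradient descent
    X -= eta * grad
    X += np.sqrt(2.0 * lam * eta) * rng.standard_normal((N, d))
```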
Related papers
- Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble [36.19164064733151]
Mean-field Langevin dynamics (MFLD) is an optimization method derived by taking the mean-field limit of noisy gradient descent for two-layer neural networks.
Recent work shows that the approximation error due to finite particles remains uniform in time and diminishes as the number of particles increases.
In this paper, we establish an improved propagation of chaos (PoC) result for MFLD, which removes the exponential dependence on the regularization coefficient from the particle approximation term.
arXiv Detail & Related papers (2025-02-09T05:58:46Z)
- Uniform-in-time weak propagation of chaos for consensus-based optimization [4.533408985664949]
We study the uniform-in-time weak propagation of chaos for the consensus-based optimization (CBO) method on a bounded searching domain.
Our work shows that the weak error has order $O(N^{-1})$ uniformly in time, where $N$ denotes the number of particles.
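For context, one CBO iteration drifts every particle toward a Gibbs-weighted consensus point and adds noise scaled by the distance to it. A minimal sketch follows, with an illustrative objective and coefficients; projection onto the bounded search domain is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # toy non-convex objective (Rastrigin-like), minimized near the origin
    return np.sum(x**2 - np.cos(2 * np.pi * x), axis=-1)

N, d = 100, 2
lam, sigma, alpha, dt = 1.0, 0.7, 50.0, 0.05

X = rng.uniform(-3, 3, size=(N, d))           # particles in the search domain

for step in range(500):
    # Gibbs weights concentrate on particles with low objective values
    w = np.exp(-alpha * (f(X) - f(X).min()))  # shift for numerical stability
    x_bar = (w[:, None] * X).sum(axis=0) / w.sum()
    # drift toward the consensus point plus distance-scaled isotropic noise
    drift = -lam * (X - x_bar) * dt
    noise = (sigma * np.linalg.norm(X - x_bar, axis=1, keepdims=True)
             * np.sqrt(dt) * rng.standard_normal((N, d)))
    X = X + drift + noise
```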
arXiv Detail & Related papers (2025-02-01T22:38:10Z)
- On the Convergence of Min-Max Langevin Dynamics and Algorithm [15.132772939268989]
We study zero-sum games in the space of probability distributions over the Euclidean space $\mathbb{R}^d$ with entropy regularization.
We prove an exponential convergence guarantee for the mean-field min-max Langevin dynamics to compute the equilibrium distribution.
arXiv Detail & Related papers (2024-12-29T14:20:23Z)
- Entanglement Transition due to particle losses in a monitored fermionic chain [0.0]
We study the dynamics of the entanglement entropy scaling under quantum jumps.
We show that by tuning the system parameters, a measurement-induced transition occurs where the entanglement entropy changes from logarithmic to area-law.
arXiv Detail & Related papers (2024-08-07T11:30:09Z)
- Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
arXiv Detail & Related papers (2023-12-02T13:01:29Z)
- Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction [49.66486092259376]
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift.
Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures.
We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
arXiv Detail & Related papers (2023-06-12T16:28:11Z)
- Locality of Spontaneous Symmetry Breaking and Universal Spacing Distribution of Topological Defects Formed Across a Phase Transition [62.997667081978825]
A continuous phase transition results in the formation of topological defects with a density predicted by the Kibble-Zurek mechanism (KZM).
We characterize the spatial distribution of point-like topological defects in the resulting nonequilibrium state and model it using a Poisson point process in arbitrary spatial dimension with KZM density.
arXiv Detail & Related papers (2022-02-23T19:00:06Z)
- Convex Analysis of the Mean Field Langevin Dynamics [49.66486092259375]
A convergence rate analysis of the mean field Langevin dynamics is presented.
The proximal Gibbs distribution $p_q$ associated with the dynamics allows us to develop a convergence theory parallel to classical results in convex optimization.
arXiv Detail & Related papers (2022-01-25T17:13:56Z)
- Quantum correlations, entanglement spectrum and coherence of two-particle reduced density matrix in the Extended Hubbard Model [62.997667081978825]
We study the ground state properties of the one-dimensional extended Hubbard model at half-filling.
In particular, in the superconducting region, we find that the entanglement spectrum signals a transition from a dominant singlet (SS) to a triplet (TS) pairing ordering in the system.
arXiv Detail & Related papers (2021-10-29T21:02:24Z)
- Hessian-Free High-Resolution Nesterov Acceleration for Sampling [55.498092486970364]
Nesterov's Accelerated Gradient (NAG) for optimization has better performance than its continuous-time limit (noiseless kinetic Langevin) when a finite step-size is employed.
This work explores the sampling counterpart of this phenomenon and proposes a diffusion process whose discretizations can yield accelerated gradient-based MCMC methods.
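For reference, the continuous-time limit mentioned here is kinetic (underdamped) Langevin dynamics; a minimal Euler-type discretization for a standard Gaussian target is sketched below. The paper's accelerated diffusion and its discretization differ; this is only a sketch of the baseline object being accelerated, with an illustrative potential and parameters.

```python
import numpy as np

rng = np.random.default_rng(2)

def grad_U(x):
    # potential U(x) = |x|^2 / 2, so the target density is a standard Gaussian
    return x

d, gamma, h, n_steps = 2, 2.0, 0.01, 5000   # dim, friction, step size, steps
x = np.zeros(d)                             # position
v = np.zeros(d)                             # velocity (momentum)

for _ in range(n_steps):
    # velocity update: friction + potential gradient + injected noise
    v += -h * (gamma * v + grad_U(x)) \
         + np.sqrt(2.0 * gamma * h) * rng.standard_normal(d)
    # position update driven by the velocity
    x += h * v
```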
arXiv Detail & Related papers (2020-06-16T15:07:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.