Generalized EXTRA stochastic gradient Langevin dynamics
- URL: http://arxiv.org/abs/2412.01993v1
- Date: Mon, 02 Dec 2024 21:57:30 GMT
- Title: Generalized EXTRA stochastic gradient Langevin dynamics
- Authors: Mert Gurbuzbalaban, Mohammad Rafiqul Islam, Xiaoyu Wang, Lingjiong Zhu,
- Abstract summary: Langevin algorithms are popular Markov Chain Monte Carlo methods for Bayesian learning.
Their versions such as Langevin dynamics (SGLD) allow iterative learning based on randomly sampled mini-batches.
When data is decentralized across a network of agents subject to communication and privacy constraints, standard SGLD algorithms cannot be applied.
- Score: 11.382163777108385
- License:
- Abstract: Langevin algorithms are popular Markov Chain Monte Carlo methods for Bayesian learning, particularly when the aim is to sample from the posterior distribution of a parametric model, given the input data and the prior distribution over the model parameters. Their stochastic versions such as stochastic gradient Langevin dynamics (SGLD) allow iterative learning based on randomly sampled mini-batches of large datasets and are scalable to large datasets. However, when data is decentralized across a network of agents subject to communication and privacy constraints, standard SGLD algorithms cannot be applied. Instead, we employ decentralized SGLD (DE-SGLD) algorithms, where Bayesian learning is performed collaboratively by a network of agents without sharing individual data. Nonetheless, existing DE-SGLD algorithms induce a bias at every agent that can negatively impact performance; this bias persists even when using full batches and is attributable to network effects. Motivated by the EXTRA algorithm and its generalizations for decentralized optimization, we propose the generalized EXTRA stochastic gradient Langevin dynamics, which eliminates this bias in the full-batch setting. Moreover, we show that, in the mini-batch setting, our algorithm provides performance bounds that significantly improve upon those of standard DE-SGLD algorithms in the literature. Our numerical results also demonstrate the efficiency of the proposed approach.
Related papers
- Stability and Generalization for Distributed SGDA [70.97400503482353]
We propose the stability-based generalization analytical framework for Distributed-SGDA.
We conduct a comprehensive analysis of stability error, generalization gap, and population risk across different metrics.
Our theoretical results reveal the trade-off between the generalization gap and optimization error.
arXiv Detail & Related papers (2024-11-14T11:16:32Z) - Stability and Generalization of the Decentralized Stochastic Gradient
Descent Ascent Algorithm [80.94861441583275]
We investigate the complexity of the generalization bound of the decentralized gradient descent (D-SGDA) algorithm.
Our results analyze the impact of different top factors on the generalization of D-SGDA.
We also balance it with the generalization to obtain the optimal convex-concave setting.
arXiv Detail & Related papers (2023-10-31T11:27:01Z) - A Dataset Fusion Algorithm for Generalised Anomaly Detection in
Homogeneous Periodic Time Series Datasets [0.0]
"Dataset Fusion" is an algorithm for fusing periodic signals from multiple homogeneous datasets into a single dataset.
The proposed approach significantly outperforms conventional training approaches with an Average F1 score of 0.879.
Results show that using only 6.25% of the training data, translating to a 93.7% reduction in computational power, results in a mere 4.04% decrease in performance.
arXiv Detail & Related papers (2023-05-14T16:24:09Z) - Joint Graph Learning and Model Fitting in Laplacian Regularized
Stratified Models [5.933030735757292]
Laplacian regularized stratified models (LRSM) are models that utilize the explicit or implicit network structure of the sub-problems.
This paper shows the importance and sensitivity of graph weights in LRSM, and provably show that the sensitivity can be arbitrarily large.
We propose a generic approach to jointly learn the graph while fitting the model parameters by solving a single optimization problem.
arXiv Detail & Related papers (2023-05-04T06:06:29Z) - LD-GAN: Low-Dimensional Generative Adversarial Network for Spectral
Image Generation with Variance Regularization [72.4394510913927]
Deep learning methods are state-of-the-art for spectral image (SI) computational tasks.
GANs enable diverse augmentation by learning and sampling from the data distribution.
GAN-based SI generation is challenging since the high-dimensionality nature of this kind of data hinders the convergence of the GAN training yielding to suboptimal generation.
We propose a statistical regularization to control the low-dimensional representation variance for the autoencoder training and to achieve high diversity of samples generated with the GAN.
arXiv Detail & Related papers (2023-04-29T00:25:02Z) - Implicit Stochastic Gradient Descent for Training Physics-informed
Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems.
PINNs are trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ implicit gradient descent (ISGD) method to train PINNs for improving the stability of training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z) - Efficiency Ordering of Stochastic Gradient Descent [9.634481296779057]
We consider the gradient descent (SGD) algorithm driven by a general sampling sequence, including i.i.i.d noise and random walk on an arbitrary graph.
We employ the notion of efficiency ordering', a well-analyzed tool for comparing the performance of Markov Chain Monte Carlo samplers.
arXiv Detail & Related papers (2022-09-15T16:50:55Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of dynamic programming (RDP) randomized for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation so can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - A Contour Stochastic Gradient Langevin Dynamics Algorithm for
Simulations of Multi-modal Distributions [17.14287157979558]
We propose an adaptively weighted gradient Langevin dynamics (SGLD) for learning in big data statistics.
The proposed algorithm is tested on benchmark datasets including CIFAR100.
arXiv Detail & Related papers (2020-10-19T19:20:47Z) - Decentralized Stochastic Gradient Langevin Dynamics and Hamiltonian
Monte Carlo [8.94392435424862]
Decentralized SGLD (DE-SGLD) and Decentralized SGHMC (DE-SGHMC) are algorithms for scaleable Bayesian inference in the decentralized setting for large datasets.
We show that when the posterior distribution is strongly log-concave and smooth, the iterates of these algorithms converge linearly to a neighborhood of the target distribution in the 2-Wasserstein distance.
arXiv Detail & Related papers (2020-07-01T16:26:00Z) - Improving Sampling Accuracy of Stochastic Gradient MCMC Methods via
Non-uniform Subsampling of Gradients [54.90670513852325]
We propose a non-uniform subsampling scheme to improve the sampling accuracy.
EWSG is designed so that a non-uniform gradient-MCMC method mimics the statistical behavior of a batch-gradient-MCMC method.
In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index.
arXiv Detail & Related papers (2020-02-20T18:56:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.