Learning Stochastic Graph Neural Networks with Constrained Variance
- URL: http://arxiv.org/abs/2201.12611v1
- Date: Sat, 29 Jan 2022 15:55:58 GMT
- Title: Learning Stochastic Graph Neural Networks with Constrained Variance
- Authors: Zhan Gao and Elvin Isufi
- Abstract summary: Stochastic graph neural networks (SGNNs) are information processing architectures that learn representations from data over random graphs.
We propose a variance-constrained optimization problem for SGNNs, balancing the expected performance and the stochastic deviation.
An alternating primal-dual learning procedure solves the problem by updating the SGNN parameters with gradient descent and the dual variable with gradient ascent.
- Score: 18.32587282139282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic graph neural networks (SGNNs) are information processing
architectures that learn representations from data over random graphs. SGNNs
are trained with respect to the expected performance, which comes with no
guarantee about deviations of particular output realizations around the optimal
expectation. To overcome this issue, we propose a variance-constrained
optimization problem for SGNNs, balancing the expected performance and the
stochastic deviation. An alternating primal-dual learning procedure is
undertaken that solves the problem by updating the SGNN parameters with
gradient descent and the dual variable with gradient ascent. To characterize
the explicit effect of the variance-constrained learning, we conduct a
theoretical analysis on the variance of the SGNN output and identify a
trade-off between the stochastic robustness and the discrimination power. We
further analyze the duality gap of the variance-constrained optimization
problem and the converging behavior of the primal-dual learning procedure. The
former indicates the optimality loss induced by the dual transformation and the
latter characterizes the limiting error of the iterative algorithm, both of
which guarantee the performance of the variance-constrained learning. Through
numerical simulations, we corroborate our theoretical findings and observe a
strong expected performance with a controllable standard deviation.
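
To make the alternating primal-dual procedure concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of variance-constrained training for a stochastic GNN. It assumes random graph realizations obtained by independently dropping edges of a nominal graph, a single-layer graph filter as the SGNN, a mean-squared-error objective, and a constraint keeping the variance of the per-realization loss below a budget; names such as SimpleSGNN, sample_graph, var_budget, and dual_step are illustrative.

```python
# Minimal sketch (assumed setup, not the authors' code) of variance-constrained
# primal-dual training for a stochastic graph neural network.
import torch

torch.manual_seed(0)

# ---- Synthetic problem: nominal graph, node features, regression targets ----
n_nodes, n_feat, n_out = 20, 8, 1
S = (torch.rand(n_nodes, n_nodes) < 0.3).float()   # nominal adjacency (graph shift)
S = torch.triu(S, 1); S = S + S.t()                # symmetric, no self-loops
X = torch.randn(n_nodes, n_feat)                   # node features
y = torch.randn(n_nodes, n_out)                    # regression targets

class SimpleSGNN(torch.nn.Module):
    """One-layer graph filter: y_hat = tanh(S_hat @ X @ H)."""
    def __init__(self, f_in, f_out):
        super().__init__()
        self.H = torch.nn.Parameter(0.1 * torch.randn(f_in, f_out))

    def forward(self, S_hat, X):
        return torch.tanh(S_hat @ X @ self.H)

def sample_graph(S, p_keep=0.8):
    """Random realization: keep each nominal edge independently with prob p_keep."""
    mask = (torch.rand_like(S) < p_keep).float()
    mask = torch.triu(mask, 1)
    return S * (mask + mask.t())

model = SimpleSGNN(n_feat, n_out)
primal_opt = torch.optim.SGD(model.parameters(), lr=1e-2)

mu = torch.tensor(0.0)   # dual variable (Lagrange multiplier) for the variance constraint
var_budget = 0.05        # allowed variance of the per-realization loss (constraint bound)
dual_step = 1e-2         # step size for dual gradient ascent
n_samples = 16           # graph realizations per iteration (Monte Carlo estimate)

for it in range(500):
    # Per-realization losses over a batch of random graph samples.
    losses = torch.stack([
        torch.mean((model(sample_graph(S), X) - y) ** 2) for _ in range(n_samples)
    ])
    mean_loss, var_loss = losses.mean(), losses.var(unbiased=True)

    # Primal step: gradient descent on the Lagrangian w.r.t. the SGNN parameters.
    lagrangian = mean_loss + mu * (var_loss - var_budget)
    primal_opt.zero_grad()
    lagrangian.backward()
    primal_opt.step()

    # Dual step: gradient ascent on the multiplier, projected onto mu >= 0.
    # d(Lagrangian)/d(mu) is simply the constraint violation Var[loss] - budget.
    with torch.no_grad():
        mu = torch.clamp(mu + dual_step * (var_loss - var_budget), min=0.0)

    if it % 100 == 0:
        print(f"it={it:4d}  E[loss]={mean_loss.item():.4f}  "
              f"Var[loss]={var_loss.item():.4f}  mu={mu.item():.3f}")
```

In this sketch the expectation and variance of the loss are estimated by Monte Carlo over a small batch of graph realizations at each iteration, and the multiplier is projected onto the nonnegative orthant after each ascent step, mirroring the descent/ascent alternation described in the abstract.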
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - On the Convergence Analysis of Over-Parameterized Variational Autoencoders: A Neural Tangent Kernel Perspective [7.580900499231056]
Variational Auto-Encoders (VAEs) have emerged as powerful probabilistic models for generative tasks.
This paper provides a mathematical proof of VAE convergence under mild assumptions.
We also establish a novel connection between the optimization problem faced by over-parameterized SNNs and the Kernel Ridge Regression (KRR) problem.
arXiv Detail & Related papers (2024-09-09T06:10:31Z) - Convergence of Implicit Gradient Descent for Training Two-Layer Physics-Informed Neural Networks [3.680127959836384]
Implicit gradient descent (IGD) outperforms common gradient descent (GD) in handling certain multi-scale problems.
We show that IGD converges to a globally optimal solution at a linear convergence rate.
arXiv Detail & Related papers (2024-07-03T06:10:41Z) - Implicit Stochastic Gradient Descent for Training Physics-informed
Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
PINNs can become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs and improve the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z) - Stability and Generalization Analysis of Gradient Methods for Shallow
Neural Networks [59.142826407441106]
We study the generalization behavior of shallow neural networks (SNNs) by leveraging the concept of algorithmic stability.
We consider gradient descent (GD) and stochastic gradient descent (SGD) to train SNNs, and for both we develop consistent excess risk bounds.
arXiv Detail & Related papers (2022-09-19T18:48:00Z) - Analysis of Catastrophic Forgetting for Random Orthogonal Transformation
Tasks in the Overparameterized Regime [9.184987303791292]
We show that in permuted MNIST image classification tasks, the performance of multilayer perceptrons trained by vanilla gradient descent can be improved in the overparameterized regime.
We provide a theoretical explanation of this effect by studying a qualitatively similar two-task linear regression problem.
We show that when a model is trained on the two tasks in sequence without any additional regularization, the risk gain on the first task is small.
arXiv Detail & Related papers (2022-06-01T18:04:33Z) - Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded in terms of the 'complexity' of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z) - Accurate and Reliable Forecasting using Stochastic Differential
Equations [48.21369419647511]
It is critical yet challenging for deep learning models to properly characterize uncertainty that is pervasive in real-world environments.
This paper develops SDE-HNN to characterize the interaction between the predictive mean and variance of HNNs for accurate and reliable regression.
Experiments on the challenging datasets show that our method significantly outperforms the state-of-the-art baselines in terms of both predictive performance and uncertainty quantification.
arXiv Detail & Related papers (2021-03-28T04:18:11Z) - A Lagrangian Dual-based Theory-guided Deep Neural Network [0.0]
The Lagrangian dual-based TgNN (TgNN-LD) is proposed to improve the effectiveness of TgNN.
Experimental results demonstrate the superiority of the Lagrangian dual-based TgNN.
arXiv Detail & Related papers (2020-08-24T02:06:19Z) - Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of stochasticity in its success is still unclear.
We show that multiplicative noise, as it commonly arises due to variance in local rates of convergence, results in heavy-tailed behavior in the parameters.
A detailed analysis is conducted in which we describe key factors, including step size and data, and observe similar heavy-tailed behavior in state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.