Learning Stochastic Graph Neural Networks with Constrained Variance
- URL: http://arxiv.org/abs/2201.12611v1
- Date: Sat, 29 Jan 2022 15:55:58 GMT
- Title: Learning Stochastic Graph Neural Networks with Constrained Variance
- Authors: Zhan Gao and Elvin Isufi
- Abstract summary: Stochastic graph neural networks (SGNNs) are information processing architectures that learn representations from data over random graphs.
We propose a variance-constrained optimization problem for SGNNs, balancing the expected performance and the stochastic deviation.
An alternating primal-dual learning procedure is undertaken that solves the problem by updating the SGNN parameters with gradient descent and the dual variable with gradient ascent.
- Score: 18.32587282139282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic graph neural networks (SGNNs) are information processing
architectures that learn representations from data over random graphs. SGNNs
are trained with respect to the expected performance, which comes with no
guarantee about deviations of particular output realizations around the optimal
expectation. To overcome this issue, we propose a variance-constrained
optimization problem for SGNNs, balancing the expected performance and the
stochastic deviation. An alternating primal-dual learning procedure is
undertaken that solves the problem by updating the SGNN parameters with
gradient descent and the dual variable with gradient ascent. To characterize
the explicit effect of the variance-constrained learning, we conduct a
theoretical analysis on the variance of the SGNN output and identify a
trade-off between the stochastic robustness and the discrimination power. We
further analyze the duality gap of the variance-constrained optimization
problem and the converging behavior of the primal-dual learning procedure. The
former indicates the optimality loss induced by the dual transformation and the
latter characterizes the limiting error of the iterative algorithm, both of
which guarantee the performance of the variance-constrained learning. Through
numerical simulations, we corroborate our theoretical findings and observe a
strong expected performance with a controllable standard deviation.
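The alternating primal-dual procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy order-one stochastic graph filter on a ring graph with random edge drops, a squared-error loss, Monte Carlo estimates of the expected loss and the output variance, and hypothetical names and values (`p_keep`, `c_var`, `eta_h`, `eta_g`, `sample_features`). The Lagrangian is taken to be the expected loss plus `gamma` times (output variance minus the budget `c_var`); the filter coefficients `h` take a gradient descent step and the dual variable `gamma` takes a projected gradient ascent step.

```python
import numpy as np

rng = np.random.default_rng(0)
n, M = 10, 200            # nodes and Monte Carlo graph samples (illustrative)
p_keep = 0.8              # assumed probability that an edge survives
c_var = 0.02              # assumed variance budget
eta_h, eta_g = 0.02, 0.5  # assumed primal and dual step sizes

# Nominal ring graph, scaled to unit spectral norm.
idx = np.arange(n)
A = np.zeros((n, n))
A[idx, (idx + 1) % n] = 1.0
A[(idx + 1) % n, idx] = 1.0
A /= 2.0

x = rng.standard_normal(n)
x *= np.sqrt(n) / np.linalg.norm(x)  # unit average energy per node
y = 0.5 * x + A @ x                  # toy target from the nominal graph


def sample_features(num):
    """Return [x, S x] for `num` random edge-dropped graphs S."""
    mask = np.triu(rng.random((num, n, n)) < p_keep, k=1)
    mask = mask | mask.transpose(0, 2, 1)  # keep each S symmetric
    Z = np.empty((num, n, 2))
    Z[:, :, 0] = x
    Z[:, :, 1] = (A * mask) @ x
    return Z


def loss_and_var(h, Z):
    """Monte Carlo estimates of the expected loss and output variance."""
    out = Z @ h  # shape (num, n)
    loss = np.mean((out - y) ** 2)
    var = np.mean((out - out.mean(axis=0)) ** 2)
    return loss, var


def grads(h, Z):
    """Gradients of the two estimates above with respect to h."""
    scale = 2.0 / (Z.shape[0] * n)
    g_loss = scale * np.einsum("mnk,mn->k", Z, Z @ h - y)
    Zc = Z - Z.mean(axis=0)
    g_var = scale * np.einsum("mnk,mn->k", Zc, Zc @ h)
    return g_loss, g_var


h, gamma = np.zeros(2), 0.0
for _ in range(500):
    Z = sample_features(M)
    g_loss, g_var = grads(h, Z)
    h = h - eta_h * (g_loss + gamma * g_var)         # primal gradient descent
    _, var = loss_and_var(h, Z)
    gamma = max(0.0, gamma + eta_g * (var - c_var))  # projected dual ascent

loss, var = loss_and_var(h, sample_features(M))
print(f"h={h}, gamma={gamma:.3f}, loss={loss:.4f}, variance={var:.4f}")
```

In this toy setting only the coefficient on `S x` contributes to the output variance, so tightening `c_var` shrinks that coefficient at some cost in expected loss, which loosely mirrors the trade-off between stochastic robustness and discrimination power that the abstract mentions.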
Related papers
- Exploring End-to-end Differentiable Neural Charged Particle Tracking -- A Loss Landscape Perspective [0.0]
We propose an E2E differentiable decision-focused learning scheme for particle tracking.
We show that differentiable variations of discrete assignment operations allow for efficient network optimization.
We argue that E2E differentiability provides, besides the general availability of gradient information, an important tool for robust particle tracking to mitigate prediction instabilities.
arXiv Detail & Related papers (2024-07-18T11:42:58Z)
- Neural Tangent Kernels Motivate Graph Neural Networks with Cross-Covariance Graphs [94.44374472696272]
We investigate neural tangent kernels (NTKs) and alignment in the context of graph neural networks (GNNs).
Our results establish the theoretical guarantees on the optimality of the alignment for a two-layer GNN.
These guarantees are characterized by the graph shift operator being a function of the cross-covariance between the input and the output data.
arXiv Detail & Related papers (2023-10-16T19:54:21Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems.
However, PINNs suffer training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose employing the implicit stochastic gradient descent (ISGD) method to train PINNs in order to improve the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks [59.142826407441106]
We study the generalization behavior of shallow neural networks (SNNs) by leveraging the concept of algorithmic stability.
We consider gradient descent (GD) and stochastic gradient descent (SGD) to train SNNs, for both of which we develop consistent excess risk bounds.
arXiv Detail & Related papers (2022-09-19T18:48:00Z)
- Analysis of Catastrophic Forgetting for Random Orthogonal Transformation Tasks in the Overparameterized Regime [9.184987303791292]
We show that in permuted MNIST image classification tasks, the performance of multilayer perceptrons trained by vanilla gradient descent can be improved.
We provide a theoretical explanation of this effect by studying a qualitatively similar two-task linear regression problem.
We show that when a model is trained on the two tasks in sequence without any additional regularization, the risk gain on the first task is small.
arXiv Detail & Related papers (2022-06-01T18:04:33Z)
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded based on the 'complexity' of the fractal structure that underlies its generalization measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
- Accurate and Reliable Forecasting using Stochastic Differential Equations [48.21369419647511]
It is critical yet challenging for deep learning models to properly characterize uncertainty that is pervasive in real-world environments.
This paper develops SDE-HNN to characterize the interaction between the predictive mean and variance of HNNs for accurate and reliable regression.
Experiments on the challenging datasets show that our method significantly outperforms the state-of-the-art baselines in terms of both predictive performance and uncertainty quantification.
arXiv Detail & Related papers (2021-03-28T04:18:11Z)
- A Lagrangian Dual-based Theory-guided Deep Neural Network [0.0]
The Lagrangian dual-based TgNN (TgNN-LD) is proposed to improve the effectiveness of TgNN.
Experimental results demonstrate the superiority of the Lagrangian dual-based TgNN.
arXiv Detail & Related papers (2020-08-24T02:06:19Z)
- Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the role of stochasticity in its success is still unclear.
We show that heavy-tailed behavior commonly arises in the parameters as a result of multiplicative noise due to variance.
A detailed analysis is conducted in which we describe key factors, including step size and data, all of which exhibit similar results on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.