PAC-Bayes Generalisation Bounds for Dynamical Systems Including Stable
RNNs
- URL: http://arxiv.org/abs/2312.09793v1
- Date: Fri, 15 Dec 2023 13:49:29 GMT
- Title: PAC-Bayes Generalisation Bounds for Dynamical Systems Including Stable
RNNs
- Authors: Deividas Eringis, John Leth, Zheng-Hua Tan, Rafal Wisniewski, Mihaly
Petreczky
- Abstract summary: We derive a PAC-Bayes bound on the generalisation gap for a special class of discrete-time non-linear dynamical systems.
The proposed bound converges to zero as the dataset size increases.
Unlike other available bounds, the derived bound holds for non-i.i.d. data (time series) and does not grow with the number of steps of the RNN.
- Score: 11.338419403452239
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we derive a PAC-Bayes bound on the generalisation gap in a
supervised time-series setting, for a special class of discrete-time non-linear
dynamical systems. This class includes stable recurrent neural networks (RNNs),
and the application to RNNs was the motivation for this work. To obtain the
results, we impose stability constraints on the allowed models, where stability
is understood in the sense of dynamical systems. For RNNs, these stability
conditions can be expressed as conditions on the weights. We assume that the
processes involved are essentially bounded and that the loss functions are
Lipschitz. The proposed bound on the generalisation gap depends on the mixing
coefficient of the data distribution and on the essential supremum of the data.
Furthermore, the bound converges to zero as the dataset size increases. In this
paper, we 1) formalize the learning problem, 2) derive a PAC-Bayesian error
bound for such systems, 3) discuss various consequences of this error bound,
and 4) show an illustrative example, with discussions on computing the proposed
bound. Unlike other available bounds, the derived bound holds for non-i.i.d.
data (time series) and it does not grow with the number of steps of the RNN.
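The abstract states that, for RNNs, the stability constraints can be expressed as conditions on the weights, but this listing does not spell those conditions out. As a rough illustration only, the sketch below checks one standard sufficient condition for a contractive RNN update x_{t+1} = sigma(W_rec x_t + W_in u_t + b), namely L_sigma * ||W_rec||_2 < 1, where L_sigma is the Lipschitz constant of the activation and ||.||_2 is the spectral norm. The function name is hypothetical and this is not necessarily the exact condition imposed in the paper.

```python
import numpy as np

def is_contractive_rnn(W_rec: np.ndarray, activation_lipschitz: float = 1.0):
    """Check a common sufficient condition for exponential stability of the RNN
    update x_{t+1} = sigma(W_rec @ x_t + W_in @ u_t + b): the state map is a
    contraction, uniformly in the input, whenever

        activation_lipschitz * ||W_rec||_2 < 1.

    Illustrative only; not necessarily the exact condition used in the paper.
    """
    spectral_norm = float(np.linalg.norm(W_rec, 2))  # largest singular value
    return activation_lipschitz * spectral_norm < 1.0, spectral_norm

# Example: rescale a random recurrent weight matrix so the condition holds.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
W_stable = 0.9 * W / np.linalg.norm(W, 2)  # force spectral norm = 0.9

ok, sn = is_contractive_rnn(W_stable, activation_lipschitz=1.0)  # tanh is 1-Lipschitz
print(f"spectral norm = {sn:.3f}, contraction condition satisfied: {ok}")
```

Under a contraction condition of this kind, the influence of the initial state decays geometrically with time, which is the sort of behaviour that allows a generalisation bound to avoid growing with the number of RNN steps.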
Related papers
- Generalization and Risk Bounds for Recurrent Neural Networks [3.0638061480679912]
We establish a new generalization error bound for vanilla RNNs.
We provide a unified framework to calculate the Rademacher complexity that can be applied to a variety of loss functions.
arXiv Detail & Related papers (2024-11-05T03:49:06Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Learning Discretized Neural Networks under Ricci Flow [51.36292559262042]
We study Discretized Neural Networks (DNNs) composed of low-precision weights and activations.
DNNs suffer from either infinite or zero gradients due to the non-differentiable discrete function during training.
arXiv Detail & Related papers (2023-02-07T10:51:53Z) - Integrating Random Effects in Deep Neural Networks [4.860671253873579]
We propose to use the mixed models framework to handle correlated data in deep neural networks.
By treating the effects underlying the correlation structure as random effects, mixed models are able to avoid overfitted parameter estimates.
Our approach, which we call LMMNN, is demonstrated to improve performance over natural competitors in various correlation scenarios.
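As context for the claim that treating correlation-inducing effects as random effects avoids overfitted parameter estimates, here is a minimal, generic sketch of a classical linear mixed model with one random intercept per cluster: marginalising over the random effect yields a block-structured covariance, so within-cluster correlation is modelled explicitly rather than absorbed into extra fixed parameters. This is plain LMM machinery, not the LMMNN architecture from the paper; all names below are illustrative.

```python
import numpy as np

def lmm_marginal_nll(y, X, cluster_ids, beta, sigma2_e, sigma2_b):
    """Marginal negative log-likelihood of a linear mixed model with a random
    intercept per cluster:

        y = X @ beta + Z @ b + eps,  b ~ N(0, sigma2_b * I),  eps ~ N(0, sigma2_e * I).

    Marginalising over b gives y ~ N(X @ beta, V) with V = sigma2_b * Z @ Z.T + sigma2_e * I.
    Illustrative only; LMMNN replaces the fixed-effects part X @ beta with a neural network.
    """
    n = len(y)
    Z = (cluster_ids[:, None] == np.unique(cluster_ids)[None, :]).astype(float)
    V = sigma2_b * Z @ Z.T + sigma2_e * np.eye(n)
    r = y - X @ beta
    _, logdet = np.linalg.slogdet(V)
    return 0.5 * (logdet + r @ np.linalg.solve(V, r) + n * np.log(2 * np.pi))

# Tiny example: 3 clusters whose shared intercepts induce within-cluster correlation.
rng = np.random.default_rng(1)
clusters = np.repeat(np.arange(3), 10)
X = rng.standard_normal((30, 2))
b = rng.normal(0.0, 1.0, size=3)
y = X @ np.array([1.0, -2.0]) + b[clusters] + rng.normal(0.0, 0.5, size=30)
print(lmm_marginal_nll(y, X, clusters, np.array([1.0, -2.0]), 0.25, 1.0))
```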
arXiv Detail & Related papers (2022-06-07T14:02:24Z) - Robust Learning of Physics Informed Neural Networks [2.86989372262348]
Physics-informed Neural Networks (PINNs) have been shown to be effective in solving partial differential equations.
This paper shows that a PINN can be sensitive to errors in the training data and can overfit by dynamically propagating these errors over the solution domain of the PDE.
arXiv Detail & Related papers (2021-10-26T00:10:57Z) - How to train RNNs on chaotic data? [7.276372008305615]
Recurrent neural networks (RNNs) are widespread machine learning tools for modeling sequential and time series data.
They are notoriously hard to train because their loss gradients backpropagated in time tend to saturate or diverge during training.
Here we offer a comprehensive theoretical treatment of this problem by relating the loss gradients during RNN training to the Lyapunov spectrum of RNN-generated orbits.
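The entry above relates loss gradients during RNN training to the Lyapunov spectrum of RNN-generated orbits. As a rough, self-contained illustration (an autonomous tanh RNN rather than the trained, input-driven networks analysed in the paper), the sketch below estimates the largest Lyapunov exponent by pushing a tangent vector through the Jacobians of the state update and averaging the log growth rates; a positive estimate signals exponentially diverging perturbations (and hence exploding gradients), a negative one contracting dynamics (and vanishing gradients).

```python
import numpy as np

def largest_lyapunov_exponent(W_rec, x0, n_steps=5000, n_discard=500, seed=0):
    """Estimate the largest Lyapunov exponent of the orbit of
    x_{t+1} = tanh(W_rec @ x_t) by evolving a tangent vector v through the
    Jacobians J_t = diag(1 - x_{t+1}^2) @ W_rec and averaging log ||J_t v||.

    Illustrative sketch for an autonomous RNN; the paper above studies the
    full Lyapunov spectrum of trained, input-driven networks.
    """
    x = np.asarray(x0, dtype=float)
    v = np.random.default_rng(seed).standard_normal(x.shape)
    v /= np.linalg.norm(v)
    log_growth = 0.0
    for t in range(n_steps):
        x = np.tanh(W_rec @ x)                      # advance the orbit
        v = ((1.0 - x ** 2)[:, None] * W_rec) @ v   # push v through the Jacobian
        growth = np.linalg.norm(v)
        v /= growth                                 # renormalise to avoid overflow
        if t >= n_discard:                          # skip the initial transient
            log_growth += np.log(growth)
    return log_growth / (n_steps - n_discard)

# With this scaling, a gain above one typically gives chaotic (positive-exponent)
# dynamics and a gain well below one gives contracting (negative-exponent) dynamics.
rng = np.random.default_rng(0)
n = 64
W = rng.standard_normal((n, n)) / np.sqrt(n)
x0 = rng.standard_normal(n)
print("gain 1.5:", largest_lyapunov_exponent(1.5 * W, x0))
print("gain 0.5:", largest_lyapunov_exponent(0.5 * W, x0))
```

The two regimes mirror the diverging versus saturating gradient behaviour that the paper above analyses.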
arXiv Detail & Related papers (2021-10-14T09:07:42Z) - Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and a significant reduction in memory consumption.
However, they can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z) - Neural ODE Processes [64.10282200111983]
We introduce Neural ODE Processes (NDPs), a new class of processes determined by a distribution over Neural ODEs.
We show that our model can successfully capture the dynamics of low-dimensional systems from just a few data-points.
arXiv Detail & Related papers (2021-03-23T09:32:06Z) - Probabilistic Numeric Convolutional Neural Networks [80.42120128330411]
Continuous input signals like images and time series that are irregularly sampled or have missing values are challenging for existing deep learning methods.
We propose Probabilistic Numeric Convolutional Neural Networks, which represent features as Gaussian processes (GPs).
We then define a convolutional layer as the evolution of a PDE defined on this GP, followed by a nonlinearity.
In experiments we show that our approach yields a $3\times$ reduction of error from the previous state of the art on the SuperPixel-MNIST dataset and competitive performance on the medical time-series dataset PhysioNet2012.
arXiv Detail & Related papers (2020-10-21T10:08:21Z) - Stochastic Graph Neural Networks [123.39024384275054]
Graph neural networks (GNNs) model nonlinear representations in graph data with applications in distributed agent coordination, control, and planning.
Current GNN architectures assume ideal scenarios and ignore link fluctuations that occur due to environment, human factors, or external attacks.
In these situations, the GNN fails to address its distributed task if the topological randomness is not taken into account.
arXiv Detail & Related papers (2020-06-04T08:00:00Z) - A Convex Parameterization of Robust Recurrent Neural Networks [3.2872586139884623]
Recurrent neural networks (RNNs) are a class of nonlinear dynamical systems often used to model sequence-to-sequence maps.
We formulate convex sets of RNNs with stability and robustness guarantees.
arXiv Detail & Related papers (2020-04-11T03:12:42Z)