A unified framework for Hamiltonian deep neural networks
- URL: http://arxiv.org/abs/2104.13166v1
- Date: Tue, 27 Apr 2021 13:20:24 GMT
- Title: A unified framework for Hamiltonian deep neural networks
- Authors: Clara L. Galimberti, Liang Xu, Giancarlo Ferrari Trecate
- Abstract summary: Training deep neural networks (DNNs) can be difficult due to vanishing/exploding gradients during weight optimization.
We propose a class of DNNs stemming from the time discretization of Hamiltonian systems.
The proposed Hamiltonian framework, besides encompassing existing networks inspired by marginally stable ODEs, allows one to derive new and more expressive architectures.
- Score: 3.0934684265555052
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training deep neural networks (DNNs) can be difficult due to the occurrence
of vanishing/exploding gradients during weight optimization. To avoid this
problem, we propose a class of DNNs stemming from the time discretization of
Hamiltonian systems. The time-invariant version of the corresponding
Hamiltonian models enjoys marginal stability, a property that, as shown in
previous works for specific DNN architectures, can mitigate convergence to
zero or divergence of gradients. In the present paper, we formally study this
feature by deriving and analysing the backward gradient dynamics in continuous
time. The proposed Hamiltonian framework, besides encompassing existing
networks inspired by marginally stable ODEs, allows one to derive new and more
expressive architectures. The good performance of the novel DNNs is
demonstrated on benchmark classification problems, including digit recognition
using the MNIST dataset.
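For concreteness, the H-DNN layers described above arise from discretizing a Hamiltonian system y' = J ∇H(y), where J is a fixed skew-symmetric interconnection matrix. Below is a minimal sketch of one forward-Euler layer of this kind in PyTorch; the particular Hamiltonian H(y) = 1ᵀ log cosh(Ky + b), the block structure of J, the step size, and the layer sizes are illustrative assumptions, not the paper's exact parameterization.
```python
import torch
import torch.nn as nn


class HamiltonianLayer(nn.Module):
    """One forward-Euler step of y' = J K^T tanh(K y + b), with J skew-symmetric (illustrative sketch)."""

    def __init__(self, dim: int, step: float = 0.1):
        super().__init__()
        assert dim % 2 == 0, "use an even state dimension so J can pair positions/momenta"
        self.K = nn.Parameter(torch.randn(dim, dim) * 0.1)  # trainable weights
        self.b = nn.Parameter(torch.zeros(dim))              # trainable bias
        self.step = step
        # Fixed skew-symmetric interconnection J = [[0, I], [-I, 0]]
        half = dim // 2
        J = torch.zeros(dim, dim)
        J[:half, half:] = torch.eye(half)
        J[half:, :half] = -torch.eye(half)
        self.register_buffer("J", J)

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        # Gradient of the illustrative Hamiltonian H(y) = 1^T log cosh(K y + b):
        # grad_H = K^T tanh(K y + b)
        grad_H = torch.tanh(y @ self.K.T + self.b) @ self.K
        # Forward-Euler step along the Hamiltonian vector field J grad_H
        return y + self.step * grad_H @ self.J.T


# A deep network is a stack of such layers followed by a linear classifier.
if __name__ == "__main__":
    net = nn.Sequential(*[HamiltonianLayer(dim=8) for _ in range(16)], nn.Linear(8, 10))
    logits = net(torch.randn(32, 8))  # batch of 32 feature vectors
    print(logits.shape)               # torch.Size([32, 10])
```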
Related papers
- Symplectic Structure-Aware Hamiltonian (Graph) Embeddings [21.0714383010908]
Hamiltonian system-inspired GNNs have been proposed to address the dynamic nature of graph embeddings.
We present Symplectic Structure-Aware Hamiltonian GNN (SAH-GNN), a novel approach that generalizes Hamiltonian dynamics for more flexible node feature updates.
arXiv Detail & Related papers (2023-09-09T22:27:38Z)
- ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of DNN-based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z)
- Learning Trajectories of Hamiltonian Systems with Neural Networks [81.38804205212425]
We propose to enhance Hamiltonian neural networks with an estimation of a continuous-time trajectory of the modeled system.
We demonstrate that the proposed integration scheme works well for HNNs, especially with low sampling rates, noisy and irregular observations.
arXiv Detail & Related papers (2022-04-11T13:25:45Z)
- Contracting Neural-Newton Solver [0.0]
We develop a recurrent NN simulation tool, termed the Contracting Neural-Newton Solver (CoNNS).
We model the Newton solver at the heart of an implicit Runge-Kutta integrator as a contracting map that iteratively seeks a fixed point (a generic contraction iteration of this kind is sketched after this list).
We prove that successive passes through the NN are guaranteed to converge to a unique fixed point.
arXiv Detail & Related papers (2021-06-04T15:14:12Z)
- Hamiltonian Deep Neural Networks Guaranteeing Non-vanishing Gradients by Design [2.752441514346229]
DNNs can be difficult to train due to vanishing and exploding gradients during weight optimization through backpropagation.
We propose a general class of Hamiltonian DNNs (H-DNNs) that stem from the discretization of continuous-time Hamiltonian systems.
Our main result is that a broad set of H-DNNs ensures non-vanishing gradients by design for an arbitrary network depth.
The good performance of H-DNNs is demonstrated on benchmark classification problems, including image classification with the MNIST dataset.
arXiv Detail & Related papers (2021-05-27T14:52:22Z)
- UnICORNN: A recurrent model for learning very long time dependencies [0.0]
We propose a novel RNN architecture based on a structure-preserving discretization of a Hamiltonian system of second-order ordinary differential equations (a generic discretization of this kind is sketched after this list).
The resulting RNN is fast, invertible (in time), and memory-efficient, and we derive rigorous bounds on the hidden-state gradients to prove mitigation of the exploding and vanishing gradient problem.
arXiv Detail & Related papers (2021-03-09T15:19:59Z)
- Online Limited Memory Neural-Linear Bandits with Likelihood Matching [53.18698496031658]
We study neural-linear bandits for solving problems where both exploration and representation learning play an important role.
We propose a likelihood matching algorithm that is resilient to catastrophic forgetting and is completely online.
arXiv Detail & Related papers (2021-02-07T14:19:07Z)
- Overcoming Catastrophic Forgetting in Graph Neural Networks [50.900153089330175]
Catastrophic forgetting refers to the tendency of a neural network to "forget" previously learned knowledge upon learning new tasks.
We propose a novel scheme dedicated to overcoming this problem and hence strengthening continual learning in graph neural networks (GNNs).
At the heart of our approach is a generic module termed topology-aware weight preserving (TWP).
arXiv Detail & Related papers (2020-12-10T22:30:25Z)
- Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies [15.2292571922932]
We propose a novel architecture for recurrent neural networks.
Our proposed RNN is based on a time-discretization of a system of second-order ordinary differential equations.
Experiments show that the proposed RNN is comparable in performance to the state of the art on a variety of benchmarks.
arXiv Detail & Related papers (2020-10-02T12:35:04Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
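Tying back to the Contracting Neural-Newton Solver entry above, the idea of treating an implicit integration step as a contracting map can be illustrated with a plain Banach fixed-point iteration: an implicit Euler step y_{n+1} = y_n + h f(y_{n+1}) is the fixed point of g(x) = y_n + h f(x), which is a contraction whenever h times the Lipschitz constant of f is below one. The sketch below is a generic illustration under those assumptions, not the CoNNS architecture itself.
```python
import numpy as np


def implicit_euler_step(f, y_n, h=0.1, tol=1e-10, max_iter=100):
    """Solve y_{n+1} = y_n + h*f(y_{n+1}) by iterating the map g(x) = y_n + h*f(x).

    If f is L-Lipschitz and h*L < 1, g is a contraction, so the iteration
    converges to a unique fixed point (Banach fixed-point theorem)."""
    x = y_n.copy()                      # initial guess: previous state
    for _ in range(max_iter):
        x_next = y_n + h * f(x)         # one pass through the contracting map
        if np.linalg.norm(x_next - x) < tol:
            return x_next
        x = x_next
    return x


# Example: a linear vector field f(y) = A y with a small enough step size h.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
f = lambda y: A @ y
y = np.array([1.0, 0.0])
for _ in range(5):                      # five implicit-Euler steps
    y = implicit_euler_step(f, y)
print(y)
```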
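The UnICORNN and coRNN entries both build recurrent models by discretizing second-order ODEs of damped-oscillator form, roughly y'' = σ(Wy + Vu + b) − γy − εy'. Below is a minimal sketch of a symplectic-Euler style update with an explicit velocity state z = y'; the exact equations, gates, and constants in those papers differ, so all names and values here are illustrative assumptions.
```python
import numpy as np


def second_order_cell(y, z, u, W, V, b, dt=0.05, gamma=1.0, eps=0.1):
    """One symplectic-Euler step of y'' = tanh(W y + V u + b) - gamma*y - eps*y',
    with z = y' kept as an explicit velocity state (illustrative update)."""
    z_next = z + dt * (np.tanh(W @ y + V @ u + b) - gamma * y - eps * z)  # update velocity first
    y_next = y + dt * z_next                                              # then position, using the new velocity
    return y_next, z_next


# Unroll the cell over an input sequence, as a recurrent network would.
rng = np.random.default_rng(0)
hidden, inp = 16, 4
W = 0.1 * rng.standard_normal((hidden, hidden))
V = 0.1 * rng.standard_normal((hidden, inp))
b = np.zeros(hidden)
y, z = np.zeros(hidden), np.zeros(hidden)
for u_t in rng.standard_normal((100, inp)):   # a length-100 input sequence
    y, z = second_order_cell(y, z, u_t, W, V, b)
print(np.linalg.norm(y), np.linalg.norm(z))   # inspect the final hidden-state norms
```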
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.