Training Hamiltonian neural networks without backpropagation
- URL: http://arxiv.org/abs/2411.17511v1
- Date: Tue, 26 Nov 2024 15:22:30 GMT
- Title: Training Hamiltonian neural networks without backpropagation
- Authors: Atamert Rahma, Chinmay Datar, Felix Dietrich,
- Abstract summary: We present a backpropagation-free algorithm to accelerate the training of neural networks for approximating Hamiltonian systems.
We show that our approach is more than 100 times faster with CPUs than the traditionally trained Hamiltonian Neural Networks.
- Score: 0.0
- License:
- Abstract: Neural networks that synergistically integrate data and physical laws offer great promise in modeling dynamical systems. However, iterative gradient-based optimization of network parameters is often computationally expensive and suffers from slow convergence. In this work, we present a backpropagation-free algorithm to accelerate the training of neural networks for approximating Hamiltonian systems through data-agnostic and data-driven algorithms. We empirically show that data-driven sampling of the network parameters outperforms data-agnostic sampling or the traditional gradient-based iterative optimization of the network parameters when approximating functions with steep gradients or wide input domains. We demonstrate that our approach is more than 100 times faster with CPUs than the traditionally trained Hamiltonian Neural Networks using gradient-based iterative optimization and is more than four orders of magnitude accurate in chaotic examples, including the H\'enon-Heiles system.
Related papers
- From Fourier to Neural ODEs: Flow Matching for Modeling Complex Systems [20.006163951844357]
We propose a simulation-free framework for training neural ordinary differential equations (NODEs)
We employ the Fourier analysis to estimate temporal and potential high-order spatial gradients from noisy observational data.
Our approach outperforms state-of-the-art methods in terms of training time, dynamics prediction, and robustness.
arXiv Detail & Related papers (2024-05-19T13:15:23Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Simple initialization and parametrization of sinusoidal networks via
their kernel bandwidth [92.25666446274188]
sinusoidal neural networks with activations have been proposed as an alternative to networks with traditional activation functions.
We first propose a simplified version of such sinusoidal neural networks, which allows both for easier practical implementation and simpler theoretical analysis.
We then analyze the behavior of these networks from the neural tangent kernel perspective and demonstrate that their kernel approximates a low-pass filter with an adjustable bandwidth.
arXiv Detail & Related papers (2022-11-26T07:41:48Z) - Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z) - Acceleration techniques for optimization over trained neural network
ensembles [1.0323063834827415]
We study optimization problems where the objective function is modeled through feedforward neural networks with rectified linear unit activation.
We present a mixed-integer linear program based on existing popular big-$M$ formulations for optimizing over a single neural network.
arXiv Detail & Related papers (2021-12-13T20:50:54Z) - Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural network (DNN) generally takes thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z) - Convergence rates for gradient descent in the training of
overparameterized artificial neural networks with biases [3.198144010381572]
In recent years, artificial neural networks have developed into a powerful tool for dealing with a multitude of problems for which classical solution approaches.
It is still unclear why randomly gradient descent algorithms reach their limits.
arXiv Detail & Related papers (2021-02-23T18:17:47Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network.
Our model requires a much less number of communication rounds and still a number of communication rounds in theory.
Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Hyperparameter Optimization in Binary Communication Networks for
Neuromorphic Deployment [4.280642750854163]
Training neural networks for neuromorphic deployment is non-trivial.
We introduce a Bayesian approach for optimizing the hyper parameters of an algorithm for training binary communication networks that can be deployed to neuromorphic hardware.
We show that by optimizing the hyper parameters on this algorithm for each dataset, we can achieve improvements in accuracy over the previous state-of-the-art for this algorithm on each dataset.
arXiv Detail & Related papers (2020-04-21T01:15:45Z) - Understanding the Effects of Data Parallelism and Sparsity on Neural
Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z) - On transfer learning of neural networks using bi-fidelity data for
uncertainty propagation [0.0]
We explore the application of transfer learning techniques using training data generated from both high- and low-fidelity models.
In the former approach, a neural network model mapping the inputs to the outputs of interest is trained based on the low-fidelity data.
The high-fidelity data is then used to adapt the parameters of the upper layer(s) of the low-fidelity network, or train a simpler neural network to map the output of the low-fidelity network to that of the high-fidelity model.
arXiv Detail & Related papers (2020-02-11T15:56:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.