Applications of Koopman Mode Analysis to Neural Networks
- URL: http://arxiv.org/abs/2006.11765v1
- Date: Sun, 21 Jun 2020 11:00:04 GMT
- Title: Applications of Koopman Mode Analysis to Neural Networks
- Authors: Iva Manojlović, Maria Fonoberova, Ryan Mohr, Aleksandr
Andrejčuk, Zlatko Drmač, Yannis Kevrekidis, Igor Mezić
- Abstract summary: We consider the training process of a neural network as a dynamical system acting on the high-dimensional weight space.
We show how the Koopman spectrum can be used to determine the number of layers required for the architecture.
We also show how using Koopman modes we can selectively prune the network to speed up the training procedure.
- Score: 52.77024349608834
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the training process of a neural network as a dynamical system
acting on the high-dimensional weight space. Each epoch is an application of
the map induced by the optimization algorithm and the loss function. Using this
induced map, we can evaluate observables on the weight space and measure their
evolution. The evolution of the observables is given by the Koopman operator
associated with the induced dynamical system. We use the spectrum and modes of
the Koopman operator to realize the following objectives. Our methods can help
to determine, a priori, the network depth; to detect a bad initialization of
the network weights, allowing a restart before training runs too long; and to
speed up training. Additionally, our methods help enable noise rejection and
improve robustness. We show how the Koopman spectrum can be
used to determine the number of layers required for the architecture.
Additionally, we show how to distinguish convergence from non-convergence of
the training process by monitoring the spectrum; in particular, the existence
of eigenvalues clustering around 1 determines when to terminate the learning
process. We also show how, using Koopman modes, we
can selectively prune the network to speed up the training procedure. Finally,
we show that incorporating loss functions based on negative Sobolev norms can
allow for the reconstruction of a multi-scale signal polluted by very large
amounts of noise.
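In practice, the Koopman spectrum of such training dynamics is commonly estimated from weight snapshots with Dynamic Mode Decomposition (DMD), a standard finite-dimensional approximation of the Koopman operator. The sketch below illustrates that idea together with the eigenvalues-near-1 stopping heuristic described in the abstract; it is not the authors' implementation, and the function names, `rank`, and `tol` are illustrative assumptions.

```python
import numpy as np

def koopman_eigenvalues(weight_snapshots, rank=10):
    """Estimate eigenvalues of the training dynamics via DMD.

    weight_snapshots: array of shape (n_epochs, n_weights), one flattened
    weight vector per epoch, viewed as a trajectory of the induced map.
    """
    W = np.asarray(weight_snapshots).T           # columns are snapshots
    X, Y = W[:, :-1], W[:, 1:]                   # pairs (w_k, w_{k+1})
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    r = min(rank, len(s))                        # truncate to leading modes
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]
    # Reduced linear operator whose eigenvalues approximate the
    # dominant Koopman eigenvalues of the training map.
    A_tilde = U.conj().T @ Y @ Vh.conj().T @ np.diag(1.0 / s)
    return np.linalg.eigvals(A_tilde)

def looks_converged(weight_snapshots, tol=1e-2):
    """Heuristic stopping test: the dominant eigenvalues cluster
    around 1, i.e. successive weight iterates barely change."""
    lam = koopman_eigenvalues(weight_snapshots)
    return bool(np.all(np.abs(lam - 1.0) < tol))
```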
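The negative Sobolev loss in the final sentence can be realized by down-weighting high frequencies of the residual in Fourier space, which is what makes the loss tolerant of large amounts of high-frequency noise. A minimal sketch, assuming 1-D real signals and the standard H^{-s} weighting (1 + k^2)^{-s}; the default exponent `s` here is a hypothetical choice, not the paper's reported setting:

```python
import torch

def neg_sobolev_loss(pred, target, s=1.0):
    """Squared H^{-s} norm of the residual, computed via the FFT.

    High frequencies are damped by (1 + k^2)^(-s), so the loss mainly
    penalizes large-scale mismatch and largely ignores fine-scale noise.
    """
    resid = pred - target                          # (batch, n) real signals
    r_hat = torch.fft.rfft(resid, dim=-1)          # one-sided spectrum
    k = torch.arange(r_hat.shape[-1], dtype=resid.dtype, device=resid.device)
    weight = (1.0 + k ** 2) ** (-s)                # decays with frequency
    return torch.mean(weight * r_hat.abs() ** 2)
```

Minimizing such a loss fits the coarse, multi-scale structure of the target first, which is consistent with the reconstruction-under-noise behavior the abstract reports.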
Related papers
- Gradient-free training of recurrent neural networks [3.272216546040443]
We introduce a computational approach to construct all weights and biases of a recurrent neural network without using gradient-based methods.
The approach is based on a combination of random feature networks and Koopman operator theory for dynamical systems.
In computational experiments on time-series forecasting for chaotic dynamical systems and on control problems, we observe that the training time and forecasting accuracy of the recurrent neural networks we construct are improved.
arXiv Detail & Related papers (2024-10-30T21:24:34Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Credit Assignment for Trained Neural Networks Based on Koopman Operator Theory [3.130109807128472]
The credit assignment problem for neural networks refers to evaluating the contribution of each network component to the final outputs.
This paper presents an alternative, linear-dynamics perspective on the credit assignment problem for trained neural networks.
Experiments conducted on typical neural networks demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-02T06:34:27Z)
- Neural Maximum A Posteriori Estimation on Unpaired Data for Motion Deblurring [87.97330195531029]
We propose a Neural Maximum A Posteriori (NeurMAP) estimation framework for training neural networks to recover blind motion information and sharp content from unpaired data.
The proposed NeurMAP approach applies to existing deblurring neural networks, and is the first framework that enables training image deblurring networks on unpaired datasets.
arXiv Detail & Related papers (2022-04-26T08:09:47Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Denoising IMU Gyroscopes with Deep Learning for Open-Loop Attitude Estimation [0.0]
This paper proposes a learning method for denoising gyroscopes of Inertial Measurement Units (IMUs) using ground truth data.
The obtained algorithm outperforms the state-of-the-art on the (unseen) test sequences.
arXiv Detail & Related papers (2020-02-25T08:04:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.