Applications of Koopman Mode Analysis to Neural Networks
- URL: http://arxiv.org/abs/2006.11765v1
- Date: Sun, 21 Jun 2020 11:00:04 GMT
- Title: Applications of Koopman Mode Analysis to Neural Networks
- Authors: Iva Manojlović, Maria Fonoberova, Ryan Mohr, Aleksandr
Andrejčuk, Zlatko Drmač, Yannis Kevrekidis, Igor Mezić
- Abstract summary: We consider the training process of a neural network as a dynamical system acting on the high-dimensional weight space.
We show how the Koopman spectrum can be used to determine the number of layers required for the architecture.
We also show how using Koopman modes we can selectively prune the network to speed up the training procedure.
- Score: 52.77024349608834
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the training process of a neural network as a dynamical system
acting on the high-dimensional weight space. Each epoch is an application of
the map induced by the optimization algorithm and the loss function. Using this
induced map, we can evaluate observables on the weight space and measure their
evolution. The evolution of the observables is given by the Koopman operator
associated with the induced dynamical system. We use the spectrum and modes of
the Koopman operator to realize the following objectives. Our methods can help
to determine, a priori, the network depth; to detect a bad initialization of
the network weights, allowing a restart before training runs too long; and to
speed up training. Additionally, our methods help enable noise rejection and
improve robustness. We show how the Koopman spectrum can be
used to determine the number of layers required for the architecture.
Additionally, we show how to distinguish convergence from non-convergence of
the training process by monitoring the spectrum; in particular, the existence
of eigenvalues clustering around 1 determines when to terminate the learning
process. We also show how, using Koopman modes, we
can selectively prune the network to speed up the training procedure. Finally,
we show that incorporating loss functions based on negative Sobolev norms can
allow for the reconstruction of a multi-scale signal polluted by very large
amounts of noise.
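In practice, the Koopman spectrum of such training dynamics is commonly estimated from weight snapshots with Dynamic Mode Decomposition (DMD), a standard finite-dimensional approximation of the Koopman operator. The sketch below illustrates that idea together with the eigenvalues-near-1 stopping heuristic described in the abstract; it is not the authors' implementation, and the function names, `rank`, and `tol` are illustrative assumptions.

```python
import numpy as np

def koopman_eigenvalues(weight_snapshots, rank=10):
    """Estimate eigenvalues of the training dynamics via DMD.

    weight_snapshots: array of shape (n_epochs, n_weights), one flattened
    weight vector per epoch, viewed as a trajectory of the induced map.
    """
    W = np.asarray(weight_snapshots).T           # columns are snapshots
    X, Y = W[:, :-1], W[:, 1:]                   # pairs (w_k, w_{k+1})
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    r = min(rank, len(s))                        # truncate to leading modes
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]
    # Reduced linear operator whose eigenvalues approximate the
    # dominant Koopman eigenvalues of the training map.
    A_tilde = U.conj().T @ Y @ Vh.conj().T @ np.diag(1.0 / s)
    return np.linalg.eigvals(A_tilde)

def looks_converged(weight_snapshots, tol=1e-2):
    """Heuristic stopping test: the dominant eigenvalues cluster
    around 1, i.e. successive weight iterates barely change."""
    lam = koopman_eigenvalues(weight_snapshots)
    return bool(np.all(np.abs(lam - 1.0) < tol))
```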
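The negative Sobolev loss in the final sentence can be realized by down-weighting high frequencies of the residual in Fourier space, which is what makes the loss tolerant of large amounts of high-frequency noise. A minimal sketch, assuming 1-D real signals and the standard H^{-s} weighting (1 + k^2)^{-s}; the default exponent `s` here is a hypothetical choice, not the paper's reported setting:

```python
import torch

def neg_sobolev_loss(pred, target, s=1.0):
    """Squared H^{-s} norm of the residual, computed via the FFT.

    High frequencies are damped by (1 + k^2)^(-s), so the loss mainly
    penalizes large-scale mismatch and largely ignores fine-scale noise.
    """
    resid = pred - target                          # (batch, n) real signals
    r_hat = torch.fft.rfft(resid, dim=-1)          # one-sided spectrum
    k = torch.arange(r_hat.shape[-1], dtype=resid.dtype, device=resid.device)
    weight = (1.0 + k ** 2) ** (-s)                # decays with frequency
    return torch.mean(weight * r_hat.abs() ** 2)
```

Minimizing such a loss fits the coarse, multi-scale structure of the target first, which is consistent with the reconstruction-under-noise behavior the abstract reports.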
Related papers
- Gradient-free training of recurrent neural networks [3.272216546040443]
We introduce a computational approach to construct all weights and biases of a recurrent neural network without using gradient-based methods.
The approach is based on a combination of random feature networks and Koopman operator theory for dynamical systems.
In computational experiments on time-series forecasting for chaotic dynamical systems and on control problems, we observe that the training time and forecasting accuracy of the recurrent neural networks we construct are improved.
arXiv Detail & Related papers (2024-10-30T21:24:34Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Credit Assignment for Trained Neural Networks Based on Koopman Operator Theory [3.130109807128472]
The credit assignment problem for neural networks refers to evaluating the contribution of each network component to the final outputs.
This paper presents an alternative, linear-dynamics perspective on the credit assignment problem for trained neural networks.
Experiments conducted on typical neural networks demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-02T06:34:27Z)
- Neural Maximum A Posteriori Estimation on Unpaired Data for Motion Deblurring [87.97330195531029]
We propose a Neural Maximum A Posteriori (NeurMAP) estimation framework for training neural networks to recover blind motion information and sharp content from unpaired data.
The proposed NeurMAP approach applies to existing deblurring neural networks, and is the first framework that enables training image deblurring networks on unpaired datasets.
arXiv Detail & Related papers (2022-04-26T08:09:47Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Denoising IMU Gyroscopes with Deep Learning for Open-Loop Attitude Estimation [0.0]
This paper proposes a learning method for denoising gyroscopes of Inertial Measurement Units (IMUs) using ground truth data.
The obtained algorithm outperforms the state-of-the-art on the (unseen) test sequences.
arXiv Detail & Related papers (2020-02-25T08:04:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.