Extraction of nonlinearity in neural networks with Koopman operator
- URL: http://arxiv.org/abs/2402.11740v3
- Date: Thu, 27 Jun 2024 01:49:19 GMT
- Title: Extraction of nonlinearity in neural networks with Koopman operator
- Authors: Naoki Sugishita, Kayo Kinjo, Jun Ohkubo
- Abstract summary: We investigate the degree to which the nonlinearity of the neural network is essential.
We employ the Koopman operator, extended dynamic mode decomposition, and the tensor-train format.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nonlinearity plays a crucial role in deep neural networks. In this paper, we investigate the degree to which the nonlinearity of a neural network is essential. For this purpose, we employ the Koopman operator, extended dynamic mode decomposition, and the tensor-train format. The Koopman operator approach has recently been developed in physics and the nonlinear sciences; the Koopman operator describes the time evolution in the space of observables instead of the state space. Since nonlinearity in the state space can be replaced with linearity in the observable space, the approach is a promising candidate for understanding complex behavior in nonlinear systems. Here, we analyze trained neural networks for classification problems. As a result, replacing the nonlinear middle layers with the Koopman matrix yields sufficient accuracy in numerical experiments. In addition, we confirm that pruning the Koopman matrix retains sufficient accuracy even at high compression ratios. These results indicate the possibility of extracting features of neural networks with the Koopman operator approach.
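For concreteness, the recipe described in the abstract (lift hidden-layer activations with a dictionary of observables, fit a Koopman matrix by extended dynamic mode decomposition, then prune it) can be sketched with plain NumPy. This is a minimal illustration under assumptions: the polynomial dictionary, the regularization, and the names `H_in`/`H_out` are hypothetical, and the paper's tensor-train representation of the observables is not reproduced here.

```python
import numpy as np

def dictionary(X, include_quadratic=True):
    """Lift states (rows of X) to a simple polynomial observable dictionary:
    [1, x_i, x_i * x_j]. A stand-in for the paper's tensor-train dictionary."""
    n, d = X.shape
    feats = [np.ones((n, 1)), X]
    if include_quadratic:
        i, j = np.triu_indices(d)              # second-order monomials
        feats.append(X[:, i] * X[:, j])
    return np.hstack(feats)

def fit_koopman_matrix(H_in, H_out, reg=1e-6):
    """EDMD: least-squares fit of K such that Psi(H_in) @ K ~ Psi(H_out)."""
    Psi_in, Psi_out = dictionary(H_in), dictionary(H_out)
    G = Psi_in.T @ Psi_in + reg * np.eye(Psi_in.shape[1])
    A = Psi_in.T @ Psi_out
    return np.linalg.solve(G, A)               # K has shape (n_obs, n_obs)

def prune(K, keep_ratio=0.1):
    """Magnitude pruning: keep only the largest-magnitude entries of K."""
    thresh = np.quantile(np.abs(K), 1.0 - keep_ratio)
    return np.where(np.abs(K) >= thresh, K, 0.0)

# Hypothetical usage: H_in and H_out are activations collected just before and
# just after the nonlinear middle layers of a trained classifier. The middle
# layers are then replaced by the linear map Psi(h) @ K, with the state read
# back from the first-order (linear) coordinates of the observable vector.
```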
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- Koopman operator learning using invertible neural networks [0.6846628460229516]
In Koopman operator theory, a finite-dimensional nonlinear system is transformed into an infinite-dimensional linear system using a set of observable functions.
Current methodologies tend to disregard the importance of the invertibility of observable functions, which leads to inaccurate results.
We propose FlowDMD, a Flow-based Dynamic Mode Decomposition method that utilizes the Coupling Flow Invertible Neural Network (CF-INN) framework.
arXiv Detail & Related papers (2023-06-30T04:26:46Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Physics-Informed Koopman Network [14.203407036091555]
We propose a novel architecture inspired by physics-informed neural networks to represent Koopman operators.
We demonstrate that it not only reduces the need for large training datasets but also remains highly effective at approximating Koopman eigenfunctions.
arXiv Detail & Related papers (2022-11-17T08:57:57Z)
- Fast Adaptation with Linearized Neural Networks [35.43406281230279]
We study the inductive biases of linearizations of neural networks, which we show to be surprisingly good summaries of the full network functions.
Inspired by this finding, we propose a technique for embedding these inductive biases into Gaussian processes through a kernel designed from the Jacobian of the network.
In this setting, domain adaptation takes the form of interpretable posterior inference, with accompanying uncertainty estimation.
arXiv Detail & Related papers (2021-03-02T03:23:03Z)
- CKNet: A Convolutional Neural Network Based on Koopman Operator for Modeling Latent Dynamics from Pixels [5.286010070038216]
We present a convolutional neural network (CNN) based on the Koopman operator (CKNet) to identify the latent dynamics from raw pixels.
Experiments show that the identified 32-dimensional dynamics can make valid predictions for 120 steps and generate clear images.
arXiv Detail & Related papers (2021-02-19T23:29:08Z)
- Towards Understanding Hierarchical Learning: Benefits of Neural Representations [160.33479656108926]
In this work, we demonstrate that intermediate neural representations add more flexibility to neural networks.
We show that neural representations can achieve improved sample complexity compared with the raw input.
Our results characterize when neural representations are beneficial, and may provide a new perspective on why depth is important in deep learning.
arXiv Detail & Related papers (2020-06-24T02:44:54Z)
- Applications of Koopman Mode Analysis to Neural Networks [52.77024349608834]
We consider the training process of a neural network as a dynamical system acting on the high-dimensional weight space.
We show how the Koopman spectrum can be used to determine the number of layers required for the architecture.
We also show how using Koopman modes we can selectively prune the network to speed up the training procedure.
arXiv Detail & Related papers (2020-06-21T11:00:04Z)
- Optimizing Neural Networks via Koopman Operator Theory [6.09170287691728]
Koopman operator theory was recently shown to be intimately connected with neural network theory.
In this work we take the first steps in making use of this connection.
We show that Koopman operator theoretic methods allow predictions of the weights and biases of feedforward networks over a non-trivial range of training time.
arXiv Detail & Related papers (2020-06-03T16:23:07Z)
- On the distance between two neural networks and the stability of learning [59.62047284234815]
This paper relates parameter distance to gradient breakdown for a broad class of nonlinear compositional functions.
The analysis leads to a new distance function called deep relative trust and a descent lemma for neural networks.
arXiv Detail & Related papers (2020-02-09T19:18:39Z)