Related papers: A Koopman Approach to Understanding Sequence Neural Models

A Koopman Approach to Understanding Sequence Neural Models

URL: http://arxiv.org/abs/2102.07824v1
Date: Mon, 15 Feb 2021 20:05:11 GMT
Title: A Koopman Approach to Understanding Sequence Neural Models
Authors: Ilan Naiman and Omri Azencot
Abstract summary: We introduce a new approach to understanding trained sequence neural models: the Koopman Analysis of Neural Networks (KANN) method. Motivated by the relation between time-series models and self-maps, we compute approximate Koopman operators that encode well the latent dynamics. Our results extend across tasks and architectures as we demonstrate for the copy problem, ECG classification and sentiment analysis tasks.
Score: 2.8783296093434148
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce a new approach to understanding trained sequence neural models: the Koopman Analysis of Neural Networks (KANN) method. Motivated by the relation between time-series models and self-maps, we compute approximate Koopman operators that encode well the latent dynamics. Unlike other existing methods whose applicability is limited, our framework is global, and it has only weak constraints over the inputs. Moreover, the Koopman operator is linear, and it is related to a rich mathematical theory. Thus, we can use tools and insights from linear analysis and Koopman Theory in our study. For instance, we show that the operator eigendecomposition is instrumental in exploring the dominant features of the network. Our results extend across tasks and architectures as we demonstrate for the copy problem, and ECG classification and sentiment analysis tasks.

Related papers

SKOLR: Structured Koopman Operator Linear RNN for Time-Series Forecasting [24.672521835707446]
We establish a connection between Koopman operator approximation and linear Recurrent Neural Networks (RNNs)<n>We present SKOLR, which integrates a learnable spectral decomposition of the input signal with a multilayer perceptron (MLP) as the measurement functions.<n> Numerical experiments on various forecasting benchmarks and dynamical systems show that this streamlined, Koopman-theory-based design delivers exceptional performance.
arXiv Detail & Related papers (2025-06-17T02:11:06Z)
Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network. Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
SoK: On Finding Common Ground in Loss Landscapes Using Deep Model Merging Techniques [4.013324399289249]
We present a novel taxonomy of model merging techniques organized by their core algorithmic principles. We distill repeated empirical observations from the literature in these fields into characterizations of four major aspects of loss landscape geometry.
arXiv Detail & Related papers (2024-10-16T18:14:05Z)
Reasoning with trees: interpreting CNNs using hierarchies [3.6763102409647526]
We introduce a framework that uses hierarchical segmentation techniques for faithful and interpretable explanations of Convolutional Neural Networks (CNNs) Our method constructs model-based hierarchical segmentations that maintain the model's reasoning fidelity. Experiments show that our framework, xAiTrees, delivers highly interpretable and faithful model explanations.
arXiv Detail & Related papers (2024-06-19T06:45:19Z)
Operator Learning Meets Numerical Analysis: Improving Neural Networks through Iterative Methods [2.226971382808806]
We develop a theoretical framework grounded in iterative methods for operator equations. We demonstrate that popular architectures, such as diffusion models and AlphaFold, inherently employ iterative operator learning. Our work aims to enhance the understanding of deep learning by merging insights from numerical analysis.
arXiv Detail & Related papers (2023-10-02T20:25:36Z)
Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules. inputs to the model are routed through a sequence of functions in a way that is end-to-end learned. We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferrable to a new task in a sample efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
A Meta-Learning Approach for Training Explainable Graph Neural Networks [10.11960004698409]
We propose a meta-learning framework for improving the level of explainability of a GNN directly at training time. Our framework jointly trains a model to solve the original task, e.g., node classification, and to provide easily processable outputs for downstream algorithms. Our model-agnostic approach can improve the explanations produced for different GNN architectures and use any instance-based explainer to drive this process.
arXiv Detail & Related papers (2021-09-20T11:09:10Z)
Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models. We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges. We show that we can drop a large proportion of edges without deteriorating the performance of the model.
arXiv Detail & Related papers (2020-10-01T17:51:19Z)
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized Structural equation models (SEMs) We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using a gradient descent. For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
Applications of Koopman Mode Analysis to Neural Networks [52.77024349608834]
We consider the training process of a neural network as a dynamical system acting on the high-dimensional weight space. We show how the Koopman spectrum can be used to determine the number of layers required for the architecture. We also show how using Koopman modes we can selectively prune the network to speed up the training procedure.
arXiv Detail & Related papers (2020-06-21T11:00:04Z)
Optimizing Neural Networks via Koopman Operator Theory [6.09170287691728]
Koopman operator theory was recently shown to be intimately connected with neural network theory. In this work we take the first steps in making use of this connection. We show that Koopman operator theory methods allow predictions of weights and biases of feed weights over a non-trivial range of training time.
arXiv Detail & Related papers (2020-06-03T16:23:07Z)
Forecasting Sequential Data using Consistent Koopman Autoencoders [52.209416711500005]
A new class of physics-based methods related to Koopman theory has been introduced, offering an alternative for processing nonlinear dynamical systems. We propose a novel Consistent Koopman Autoencoder model which, unlike the majority of existing work, leverages the forward and backward dynamics. Key to our approach is a new analysis which explores the interplay between consistent dynamics and their associated Koopman operators.
arXiv Detail & Related papers (2020-03-04T18:24:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.