The geometry of integration in text classification RNNs
- URL: http://arxiv.org/abs/2010.15114v2
- Date: Fri, 3 Jun 2022 17:05:49 GMT
- Title: The geometry of integration in text classification RNNs
- Authors: Kyle Aitken, Vinay V. Ramasesh, Ankush Garg, Yuan Cao, David Sussillo,
Niru Maheswaranathan
- Abstract summary: We study recurrent networks trained on a battery of both natural and synthetic text classification tasks.
We find the dynamics of these trained RNNs to be both interpretable and low-dimensional.
Our observations span multiple architectures and datasets, reflecting a common mechanism RNNs employ to perform text classification.
- Score: 20.76659136484842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the widespread application of recurrent neural networks (RNNs) across
a variety of tasks, a unified understanding of how RNNs solve these tasks
remains elusive. In particular, it is unclear what dynamical patterns arise in
trained RNNs, and how those patterns depend on the training dataset or task.
This work addresses these questions in the context of a specific natural
language processing task: text classification. Using tools from dynamical
systems analysis, we study recurrent networks trained on a battery of both
natural and synthetic text classification tasks. We find the dynamics of these
trained RNNs to be both interpretable and low-dimensional. Specifically, across
architectures and datasets, RNNs accumulate evidence for each class as they
process the text, using a low-dimensional attractor manifold as the underlying
mechanism. Moreover, the dimensionality and geometry of the attractor manifold
are determined by the structure of the training dataset; in particular, we
describe how simple word-count statistics computed on the training dataset can
be used to predict these properties. Our observations span multiple
architectures and datasets, reflecting a common mechanism RNNs employ to
perform text classification. To the degree that integration of evidence towards
a decision is a common computational primitive, this work lays the foundation
for using dynamical systems techniques to study the inner workings of RNNs.
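To make the abstract's claim concrete, the sketch below shows one way such an analysis could be set up: collect the hidden states of a (toy) RNN text classifier as it reads token sequences, and estimate how many principal components are needed to capture the states it visits, a common proxy for the dimensionality of the underlying manifold. This is an illustrative sketch only, not the authors' code; the architecture, layer sizes, and the PCA-based dimensionality estimate are assumptions made for demonstration.

```python
# Minimal sketch (assumed, not from the paper): probe the dimensionality of the
# hidden-state manifold of an RNN text classifier via PCA.
import numpy as np
import torch
import torch.nn as nn
from sklearn.decomposition import PCA

class TextClassifierRNN(nn.Module):
    """Toy GRU classifier standing in for the trained networks studied in the paper."""
    def __init__(self, vocab_size=5000, embed_dim=64, hidden_size=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_size, batch_first=True)
        self.readout = nn.Linear(hidden_size, num_classes)

    def forward(self, tokens):
        h_seq, _ = self.rnn(self.embed(tokens))  # hidden states at every timestep
        logits = self.readout(h_seq[:, -1])      # classify from the final state
        return logits, h_seq

def manifold_dimensionality(hidden_states, variance_threshold=0.95):
    """Number of principal components needed to explain most of the variance in the
    visited hidden states -- a simple proxy for the attractor-manifold dimension."""
    flat = hidden_states.reshape(-1, hidden_states.shape[-1])
    pca = PCA().fit(flat)
    cumvar = np.cumsum(pca.explained_variance_ratio_)
    return int(np.searchsorted(cumvar, variance_threshold) + 1)

# Usage with random token sequences; in practice one would use a trained model and real text.
model = TextClassifierRNN()
tokens = torch.randint(0, 5000, (32, 40))  # batch of 32 sequences of length 40
with torch.no_grad():
    _, h_seq = model(tokens)
print("effective dimensionality:", manifold_dimensionality(h_seq.numpy()))
```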
Related papers
- Multiway Multislice PHATE: Visualizing Hidden Dynamics of RNNs through Training [6.326396282553267]
Recurrent neural networks (RNNs) are a widely used tool for sequential data analysis; however, they are still often seen as black boxes of computation.
Here, we present Multiway Multislice PHATE (MM-PHATE), a novel method for visualizing the evolution of RNNs' hidden states.
arXiv Detail & Related papers (2024-06-04T05:05:27Z)
- Deep Neural Networks via Complex Network Theory: a Perspective [3.1023851130450684]
Deep Neural Networks (DNNs) can be represented as graphs whose links and vertices iteratively process data and solve tasks sub-optimally. Complex Network Theory (CNT), merging statistical physics with graph theory, provides a method for interpreting neural networks by analysing their weights and neuron structures.
In this work, we extend the existing CNT metrics with measures that sample from the DNNs' training distribution, shifting from a purely topological analysis to one that connects with the interpretability of deep learning.
arXiv Detail & Related papers (2024-04-17T08:42:42Z)
- Topological Representations of Heterogeneous Learning Dynamics of Recurrent Spiking Neural Networks [16.60622265961373]
Spiking Neural Networks (SNNs) have become an essential paradigm in neuroscience and artificial intelligence.
Recent work in the literature has studied the network representations of deep neural networks.
arXiv Detail & Related papers (2024-03-19T05:37:26Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Disentangling Structured Components: Towards Adaptive, Interpretable and Scalable Time Series Forecasting [52.47493322446537]
We develop an adaptive, interpretable, and scalable forecasting framework that seeks to individually model each component of the spatial-temporal patterns.
SCNN works with a pre-defined generative process of multivariate time series (MTS), which arithmetically characterizes the latent structure of the spatial-temporal patterns.
Extensive experiments are conducted to demonstrate that SCNN can achieve superior performance over state-of-the-art models on three real-world datasets.
arXiv Detail & Related papers (2023-05-22T13:39:44Z)
- Parallel Neural Networks in Golang [0.0]
This paper describes the design and implementation of parallel neural networks (PNNs) with the novel programming language Golang.
Golang and its inherent parallelization support proved well suited for parallel neural network simulation, yielding considerably lower processing times than sequential variants.
arXiv Detail & Related papers (2023-04-19T11:56:36Z)
- Learning Deep Morphological Networks with Neural Architecture Search [19.731352645511052]
We propose a method based on meta-learning to incorporate morphological operators into Deep Neural Networks.
The learned architecture demonstrates how our novel morphological operations significantly increase DNN performance on various tasks.
arXiv Detail & Related papers (2021-06-14T19:19:48Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- NSL: Hybrid Interpretable Learning From Noisy Raw Data [66.15862011405882]
This paper introduces a hybrid neural-symbolic learning framework, called NSL, that learns interpretable rules from labelled unstructured data.
NSL combines pre-trained neural networks for feature extraction with FastLAS, a state-of-the-art ILP system for rule learning under the answer set semantics.
We demonstrate that NSL is able to learn robust rules from MNIST data and achieve comparable or superior accuracy when compared to neural network and random forest baselines.
arXiv Detail & Related papers (2020-12-09T13:02:44Z)
- Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup allowing for a neural network to learn both its size and topology during the course of a gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)
- Recurrent Neural Network Learning of Performance and Intrinsic Population Dynamics from Sparse Neural Data [77.92736596690297]
We introduce a novel training strategy that allows learning not only the input-output behavior of an RNN but also its internal network dynamics.
We test the proposed method by training an RNN to simultaneously reproduce internal dynamics and output signals of a physiologically-inspired neural model.
Remarkably, we show that the reproduction of the internal dynamics is successful even when the training algorithm relies on the activities of a small subset of neurons.
arXiv Detail & Related papers (2020-05-05T14:16:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.