The geometry of integration in text classification RNNs
- URL: http://arxiv.org/abs/2010.15114v2
- Date: Fri, 3 Jun 2022 17:05:49 GMT
- Title: The geometry of integration in text classification RNNs
- Authors: Kyle Aitken, Vinay V. Ramasesh, Ankush Garg, Yuan Cao, David Sussillo,
Niru Maheswaranathan
- Abstract summary: We study recurrent networks trained on a battery of both natural and synthetic text classification tasks.
We find the dynamics of these trained RNNs to be both interpretable and low-dimensional.
Our observations span multiple architectures and datasets, reflecting a common mechanism RNNs employ to perform text classification.
- Score: 20.76659136484842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the widespread application of recurrent neural networks (RNNs) across
a variety of tasks, a unified understanding of how RNNs solve these tasks
remains elusive. In particular, it is unclear what dynamical patterns arise in
trained RNNs, and how those patterns depend on the training dataset or task.
This work addresses these questions in the context of a specific natural
language processing task: text classification. Using tools from dynamical
systems analysis, we study recurrent networks trained on a battery of both
natural and synthetic text classification tasks. We find the dynamics of these
trained RNNs to be both interpretable and low-dimensional. Specifically, across
architectures and datasets, RNNs accumulate evidence for each class as they
process the text, using a low-dimensional attractor manifold as the underlying
mechanism. Moreover, the dimensionality and geometry of the attractor manifold
are determined by the structure of the training dataset; in particular, we
describe how simple word-count statistics computed on the training dataset can
be used to predict these properties. Our observations span multiple
architectures and datasets, reflecting a common mechanism RNNs employ to
perform text classification. To the degree that integration of evidence towards
a decision is a common computational primitive, this work lays the foundation
for using dynamical systems techniques to study the inner workings of RNNs.
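To make the abstract's claim concrete, the sketch below shows one way such an analysis could be set up: collect the hidden states of a (toy) RNN text classifier as it reads token sequences, and estimate how many principal components are needed to capture the states it visits, a common proxy for the dimensionality of the underlying manifold. This is an illustrative sketch only, not the authors' code; the architecture, layer sizes, and the PCA-based dimensionality estimate are assumptions made for demonstration.

```python
# Minimal sketch (assumed, not from the paper): probe the dimensionality of the
# hidden-state manifold of an RNN text classifier via PCA.
import numpy as np
import torch
import torch.nn as nn
from sklearn.decomposition import PCA

class TextClassifierRNN(nn.Module):
    """Toy GRU classifier standing in for the trained networks studied in the paper."""
    def __init__(self, vocab_size=5000, embed_dim=64, hidden_size=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_size, batch_first=True)
        self.readout = nn.Linear(hidden_size, num_classes)

    def forward(self, tokens):
        h_seq, _ = self.rnn(self.embed(tokens))  # hidden states at every timestep
        logits = self.readout(h_seq[:, -1])      # classify from the final state
        return logits, h_seq

def manifold_dimensionality(hidden_states, variance_threshold=0.95):
    """Number of principal components needed to explain most of the variance in the
    visited hidden states -- a simple proxy for the attractor-manifold dimension."""
    flat = hidden_states.reshape(-1, hidden_states.shape[-1])
    pca = PCA().fit(flat)
    cumvar = np.cumsum(pca.explained_variance_ratio_)
    return int(np.searchsorted(cumvar, variance_threshold) + 1)

# Usage with random token sequences; in practice one would use a trained model and real text.
model = TextClassifierRNN()
tokens = torch.randint(0, 5000, (32, 40))  # batch of 32 sequences of length 40
with torch.no_grad():
    _, h_seq = model(tokens)
print("effective dimensionality:", manifold_dimensionality(h_seq.numpy()))
```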
Related papers
- Multiway Multislice PHATE: Visualizing Hidden Dynamics of RNNs through Training [6.326396282553267]
Recurrent neural networks (RNNs) are a widely used tool for sequential data analysis; however, they are still often seen as black boxes of computation.
Here, we present Multiway Multislice PHATE (MM-PHATE), a novel method for visualizing the evolution of RNNs' hidden states.
arXiv Detail & Related papers (2024-06-04T05:05:27Z)
- Deep Neural Networks via Complex Network Theory: a Perspective [3.1023851130450684]
Deep Neural Networks (DNNs) can be represented as graphs whose links and vertices iteratively process data and solve tasks sub-optimally. Complex Network Theory (CNT), merging statistical physics with graph theory, provides a method for interpreting neural networks by analysing their weights and neuron structures.
In this work, we extend the existing CNT metrics with measures that sample from the DNNs' training distribution, shifting from a purely topological analysis to one that connects with the interpretability of deep learning.
arXiv Detail & Related papers (2024-04-17T08:42:42Z)
- Topological Representations of Heterogeneous Learning Dynamics of Recurrent Spiking Neural Networks [16.60622265961373]
Spiking Neural Networks (SNNs) have become an essential paradigm in neuroscience and artificial intelligence.
Recent work in the literature has studied the network representations of deep neural networks.
arXiv Detail & Related papers (2024-03-19T05:37:26Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Disentangling Structured Components: Towards Adaptive, Interpretable and Scalable Time Series Forecasting [52.47493322446537]
We develop an adaptive, interpretable, and scalable forecasting framework that seeks to individually model each component of the spatial-temporal patterns.
SCNN works with a pre-defined generative process of multivariate time series (MTS), which arithmetically characterizes the latent structure of the spatial-temporal patterns.
Extensive experiments are conducted to demonstrate that SCNN can achieve superior performance over state-of-the-art models on three real-world datasets.
arXiv Detail & Related papers (2023-05-22T13:39:44Z)
- Parallel Neural Networks in Golang [0.0]
This paper describes the design and implementation of parallel neural networks (PNNs) with the novel programming language Golang.
Golang and its inherent parallelization support proved well suited for parallel neural network simulation, yielding considerably lower processing times than sequential variants.
arXiv Detail & Related papers (2023-04-19T11:56:36Z)
- Learning Deep Morphological Networks with Neural Architecture Search [19.731352645511052]
We propose a method based on meta-learning to incorporate morphological operators into Deep Neural Networks.
The learned architecture demonstrates how our novel morphological operations significantly increase DNN performance on various tasks.
arXiv Detail & Related papers (2021-06-14T19:19:48Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- NSL: Hybrid Interpretable Learning From Noisy Raw Data [66.15862011405882]
This paper introduces a hybrid neural-symbolic learning framework, called NSL, that learns interpretable rules from labelled unstructured data.
NSL combines pre-trained neural networks for feature extraction with FastLAS, a state-of-the-art ILP system for rule learning under the answer set semantics.
We demonstrate that NSL is able to learn robust rules from MNIST data and achieve comparable or superior accuracy when compared to neural network and random forest baselines.
arXiv Detail & Related papers (2020-12-09T13:02:44Z)
- Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup allowing for a neural network to learn both its size and topology during the course of a gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)
- Recurrent Neural Network Learning of Performance and Intrinsic Population Dynamics from Sparse Neural Data [77.92736596690297]
We introduce a novel training strategy that allows learning not only the input-output behavior of an RNN but also its internal network dynamics.
We test the proposed method by training an RNN to simultaneously reproduce internal dynamics and output signals of a physiologically-inspired neural model.
Remarkably, we show that the reproduction of the internal dynamics is successful even when the training algorithm relies on the activities of a small subset of neurons.
arXiv Detail & Related papers (2020-05-05T14:16:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.