Internal representation dynamics and geometry in recurrent neural
networks
- URL: http://arxiv.org/abs/2001.03255v2
- Date: Tue, 14 Jan 2020 14:23:02 GMT
- Title: Internal representation dynamics and geometry in recurrent neural
networks
- Authors: Stefan Horoi, Guillaume Lajoie and Guy Wolf
- Abstract summary: We show how a vanilla RNN implements a simple classification task by analysing the dynamics of the network.
We find that early internal representations are evocative of the real labels of the data, but this information is not directly accessible to the output layer.
- Score: 10.016265742591674
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The efficiency of recurrent neural networks (RNNs) in dealing with sequential
data has long been established. However, unlike deep feedforward and convolutional
networks, where the recognition of a certain feature can be attributed to a
particular layer, it is unclear what "sub-task" a single recurrent step or layer
accomplishes. Our work seeks to shed light on how a vanilla RNN implements a
simple classification task by analysing the dynamics of the network and the
geometric properties of its hidden states. We find that early internal
representations are evocative of the real labels of the data, but this
information is not directly accessible to the output layer. Furthermore, the
network's dynamics and the sequence length are both critical for correct
classification, even when no additional task-relevant information is provided.
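The probing analysis described in the abstract can be made concrete. Below is a minimal sketch, assuming a synthetic two-class sequence task and a small PyTorch vanilla RNN; the task, sizes, and training details are illustrative stand-ins, not the authors' actual experimental setup. At each time step it compares (1) a freshly fitted linear probe on the hidden state, which tests whether label information is present at all, against (2) the trained output layer applied to the same state, which tests whether that information is directly accessible to the readout.

```python
# Minimal sketch: probe a vanilla RNN's hidden states for label information
# over time. Hypothetical setup; the task and all hyperparameters are
# illustrative, not those of the paper.
import torch
import torch.nn as nn

torch.manual_seed(0)

T, N, D, H, C = 20, 2000, 8, 32, 2  # seq length, samples, input dim, hidden dim, classes

# Toy task: the class sets the mean of every input step; per-step noise
# obscures it, so evidence about the label accumulates over time.
y = torch.randint(0, C, (N,))
means = torch.randn(C, D)
x = means[y].unsqueeze(1).expand(N, T, D) + 1.5 * torch.randn(N, T, D)

rnn = nn.RNN(D, H, batch_first=True)  # vanilla (Elman) RNN
readout = nn.Linear(H, C)
opt = torch.optim.Adam(list(rnn.parameters()) + list(readout.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(300):              # train on the final hidden state only
    hs, _ = rnn(x)                    # hs: (N, T, H), hidden state at every step
    loss = loss_fn(readout(hs[:, -1]), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    hs, _ = rnn(x)                    # frozen hidden states for analysis

for t in range(T):
    h_t = hs[:, t]
    probe = nn.Linear(H, C)           # fresh linear probe per time step
    popt = torch.optim.Adam(probe.parameters(), lr=1e-2)
    for _ in range(200):              # quick logistic-regression fit
        ploss = loss_fn(probe(h_t), y)
        popt.zero_grad()
        ploss.backward()
        popt.step()
    with torch.no_grad():
        probe_acc = (probe(h_t).argmax(1) == y).float().mean().item()
        out_acc = (readout(h_t).argmax(1) == y).float().mean().item()
    print(f"t={t:2d}  linear probe acc={probe_acc:.2f}  output layer acc={out_acc:.2f}")
```

If the paper's finding holds, the probe accuracy rises early in the sequence while the output-layer accuracy only catches up near the end, reflecting label information that is present in the hidden states but not yet aligned with the trained readout.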
Related papers
- Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations [54.17275171325324]
We present a counterexample to the Linear Representation Hypothesis (LRH).
When trained to repeat an input token sequence, neural networks learn to represent the token at each position with a particular order of magnitude, rather than a direction.
These findings strongly indicate that interpretability research should not be confined to the LRH. A minimal sketch of this magnitude-versus-direction diagnostic appears after this list.
arXiv Detail & Related papers (2024-08-20T15:04:37Z) - Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z) - Task structure and nonlinearity jointly determine learned
representational geometry [0.0]
We show that Tanh networks tend to learn representations that reflect the structure of the target outputs, while ReLU networks retain more information about the structure of the raw inputs.
Our findings shed light on the interplay between input-output geometry, nonlinearity, and learned representations in neural networks.
arXiv Detail & Related papers (2024-01-24T16:14:38Z) - Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - Redundant representations help generalization in wide neural networks [71.38860635025907]
We study the last hidden layer representations of various state-of-the-art convolutional neural networks.
We find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information, and differ from each other only by statistically independent noise.
arXiv Detail & Related papers (2021-06-07T10:18:54Z) - Topological Uncertainty: Monitoring trained neural networks through
persistence of activation graphs [0.9786690381850356]
In industrial applications, data coming from an open-world setting might widely differ from the benchmark datasets on which a network was trained.
We develop a method to monitor trained neural networks based on the topological properties of their activation graphs.
arXiv Detail & Related papers (2021-05-07T14:16:03Z) - Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup allowing a neural network to learn both its size and topology during gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z) - Finding trainable sparse networks through Neural Tangent Transfer [16.092248433189816]
In deep learning, trainable sparse networks that perform well on a specific task are usually constructed using label-dependent pruning criteria.
In this article, we introduce Neural Tangent Transfer, a method that instead finds trainable sparse networks in a label-free manner.
arXiv Detail & Related papers (2020-06-15T08:58:01Z) - Modeling Dynamic Heterogeneous Network for Link Prediction using
Hierarchical Attention with Temporal RNN [16.362525151483084]
We propose a novel dynamic heterogeneous network embedding method, termed DyHATR.
It uses hierarchical attention to learn heterogeneous information and incorporates recurrent neural networks with temporal attention to capture evolutionary patterns.
We benchmark our method on four real-world datasets for the task of link prediction.
arXiv Detail & Related papers (2020-04-01T17:16:47Z)
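Returning to the "Non-Linear Representations" entry above: it contrasts coding a token in the magnitude of a hidden state with coding it in the state's direction. Below is a minimal, hypothetical diagnostic sketch in PyTorch; the synthetic hidden states stand in for states recorded from a real trained network, and all names and sizes are illustrative. It fits one linear probe on unit-normalized states (direction only) and one on log-norms (magnitude only), and compares their accuracies.

```python
# Hypothetical diagnostic: is token identity carried by the direction of a
# hidden state or by its magnitude? The synthetic states below implement a
# magnitude code purely for illustration; in practice `h` and `tok` would be
# recorded from a trained network.
import torch
import torch.nn as nn

torch.manual_seed(0)
N, H, V = 3000, 64, 5                 # states, hidden dim, vocabulary size

tok = torch.randint(0, V, (N,))
direction = nn.functional.normalize(torch.randn(N, H), dim=1)
scale = 10.0 ** tok.float()           # token coded by order of magnitude
h = scale.unsqueeze(1) * direction    # direction itself is uninformative

def probe_acc(features, labels, steps=300, lr=1e-2):
    """Fit a linear (logistic-regression) probe, report training accuracy."""
    clf = nn.Linear(features.shape[1], V)
    opt = torch.optim.Adam(clf.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        loss = loss_fn(clf(features), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return (clf(features).argmax(1) == labels).float().mean().item()

dir_feats = nn.functional.normalize(h, dim=1)  # direction only
mag_feats = h.norm(dim=1, keepdim=True).log()  # magnitude only
print("direction probe acc:", probe_acc(dir_feats, tok))
print("magnitude probe acc:", probe_acc(mag_feats, tok))
```

On states that use a magnitude code, as that entry describes, the direction probe stays near chance while the magnitude probe approaches perfect accuracy; on an LRH-style direction code the pattern reverses.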
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.