Implicit N-grams Induced by Recurrence
- URL: http://arxiv.org/abs/2205.02724v1
- Date: Thu, 5 May 2022 15:53:46 GMT
- Title: Implicit N-grams Induced by Recurrence
- Authors: Xiaobing Sun and Wei Lu
- Abstract summary: We present a study that shows there actually exist some explainable components that reside within the hidden states.
We evaluated such extracted explainable features from trained RNNs on downstream sentiment analysis tasks and found they could be used to model interesting linguistic phenomena.
- Score: 10.053475465955794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although self-attention based models such as Transformers have achieved
remarkable successes on natural language processing (NLP) tasks, recent studies
reveal that they have limitations on modeling sequential transformations (Hahn,
2020), which may prompt re-examinations of recurrent neural networks (RNNs)
that demonstrated impressive results on handling sequential data. Despite many
prior attempts to interpret RNNs, their internal mechanisms have not been fully
understood, and the question on how exactly they capture sequential features
remains largely unclear. In this work, we present a study that shows there
actually exist some explainable components that reside within the hidden
states, which are reminiscent of the classical n-grams features. We evaluated
such extracted explainable features from trained RNNs on downstream sentiment
analysis tasks and found they could be used to model interesting linguistic
phenomena such as negation and intensification. Furthermore, we examined the
efficacy of using such n-gram components alone as encoders on tasks such as
sentiment analysis and language modeling, revealing they could be playing
important roles in contributing to the overall performance of RNNs. We hope our
findings could add interpretability to RNN architectures, and also provide
inspirations for proposing new architectures for sequential data.
Related papers
- Dynamical similarity analysis uniquely captures how computations develop in RNNs [3.037387520023979]
Recent findings show that some metrics respond to spurious signals, leading to misleading results.
We propose that compositional learning in recurrent neural networks (RNNs) can provide a test case for dynamical representation alignment metrics.
We show that the recently proposed Dynamical Similarity Analysis (DSA) is more noise robust and reliably identifies behaviorally relevant representations.
arXiv Detail & Related papers (2024-10-31T16:07:21Z)
- Understanding the Functional Roles of Modelling Components in Spiking Neural Networks [9.448298335007465]
Spiking neural networks (SNNs) are promising in achieving high computational efficiency with biological fidelity.
We investigate the functional roles of key modelling components, leakage, reset, and recurrence, in leaky integrate-and-fire (LIF) based SNNs.
Specifically, we find that the leakage plays a crucial role in balancing memory retention and robustness, the reset mechanism is essential for uninterrupted temporal processing and computational efficiency, and the recurrence enriches the capability to model complex dynamics at a cost of robustness degradation.
arXiv Detail & Related papers (2024-03-25T12:13:20Z)
- Episodic Memory Theory for the Mechanistic Interpretation of Recurrent Neural Networks [3.683202928838613]
We propose the Episodic Memory Theory (EMT), illustrating that RNNs can be conceptualized as discrete-time analogs of the recently proposed General Sequential Episodic Memory Model.
We introduce a novel set of algorithmic tasks tailored to probe the variable binding behavior in RNNs.
Our empirical investigations reveal that trained RNNs consistently converge to the variable binding circuit, thus indicating universality in the dynamics of RNNs.
arXiv Detail & Related papers (2023-10-03T20:52:37Z)
- Transferability of coVariance Neural Networks and Application to Interpretable Brain Age Prediction using Anatomical Features [119.45320143101381]
Graph convolutional networks (GCN) leverage topology-driven graph convolutional operations to combine information across the graph for inference tasks.
We have studied GCNs with covariance matrices as graphs in the form of coVariance neural networks (VNNs).
VNNs inherit the scale-free data processing architecture from GCNs and here, we show that VNNs exhibit transferability of performance over datasets whose covariance matrices converge to a limit object.
arXiv Detail & Related papers (2023-05-02T22:15:54Z)
- Structural Neural Additive Models: Enhanced Interpretable Machine Learning [0.0]
In recent years, the field has seen a push towards interpretable neural networks, such as the visually interpretable Neural Additive Models (NAMs).
We take a further step toward intelligibility, beyond the mere visualization of feature effects, and propose Structural Neural Additive Models (SNAMs), a modeling framework that combines classical and clearly interpretable statistical methods with the predictive power of neural applications.
arXiv Detail & Related papers (2023-02-18T09:52:30Z)
- On the Intrinsic Structures of Spiking Neural Networks [66.57589494713515]
Recent years have seen a surge of interest in SNNs owing to their remarkable potential to handle time-dependent and event-driven data.
There has been a dearth of comprehensive studies examining the impact of intrinsic structures within spiking computations.
This work delves deep into the intrinsic structures of SNNs, by elucidating their influence on the expressivity of SNNs.
arXiv Detail & Related papers (2022-06-21T09:42:30Z)
- Learning and Generalization in RNNs [11.107204912245841]
We prove that simple recurrent neural networks can learn functions of sequences.
New ideas enable us to extract information from the hidden state of the RNN in our proofs.
arXiv Detail & Related papers (2021-05-31T18:27:51Z)
- A journey in ESN and LSTM visualisations on a language task [77.34726150561087]
We trained ESNs and LSTMs on a Cross-Situational Learning (CSL) task.
The results are of three kinds: performance comparison, internal dynamics analyses and visualization of latent space.
arXiv Detail & Related papers (2020-12-03T08:32:01Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
- Recurrent Neural Network Learning of Performance and Intrinsic Population Dynamics from Sparse Neural Data [77.92736596690297]
We introduce a novel training strategy that allows learning not only the input-output behavior of an RNN but also its internal network dynamics.
We test the proposed method by training an RNN to simultaneously reproduce internal dynamics and output signals of a physiologically-inspired neural model.
Remarkably, we show that the reproduction of the internal dynamics is successful even when the training algorithm relies on the activities of a small subset of neurons.
arXiv Detail & Related papers (2020-05-05T14:16:54Z)
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 different encoder architecture and linguistic feature combinations trained on two datasets.
We find that the bias induced by the architecture and the inclusion of linguistic features are clearly expressed in the probing task performance.
arXiv Detail & Related papers (2020-04-17T09:17:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.