State-Regularized Recurrent Neural Networks to Extract Automata and
Explain Predictions
- URL: http://arxiv.org/abs/2212.05178v1
- Date: Sat, 10 Dec 2022 02:06:27 GMT
- Title: State-Regularized Recurrent Neural Networks to Extract Automata and
Explain Predictions
- Authors: Cheng Wang, Carolin Lawrence, Mathias Niepert
- Abstract summary: State-regularization makes RNNs transition between a finite set of learnable states.
We evaluate state-regularized RNNs on (1) regular languages for the purpose of automata extraction; (2) non-regular languages such as balanced parentheses and palindromes where external memory is required; and (3) real-world sequence learning tasks for sentiment analysis, visual object recognition and text categorisation.
- Score: 29.84563789289183
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recurrent neural networks are a widely used class of neural architectures.
They have, however, two shortcomings. First, they are often treated as
black-box models and as such it is difficult to understand what exactly they
learn as well as how they arrive at a particular prediction. Second, they tend
to work poorly on sequences requiring long-term memorization, despite having
this capacity in principle. We aim to address both shortcomings with a class of
recurrent networks that use a stochastic state transition mechanism between
cell applications. This mechanism, which we term state-regularization, makes
RNNs transition between a finite set of learnable states. We evaluate
state-regularized RNNs on (1) regular languages for the purpose of automata
extraction; (2) non-regular languages such as balanced parentheses and
palindromes where external memory is required; and (3) real-world sequence
learning tasks for sentiment analysis, visual object recognition and text
categorisation. We show that state-regularization (a) simplifies the extraction
of finite state automata that display an RNN's state transition dynamics; (b)
forces RNNs to operate more like automata with external memory and less like
finite state machines, which potentially leads to more structured memory;
(c) leads to better interpretability and explainability of RNNs by leveraging
the probabilistic finite state transition mechanism over time steps.
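To make the mechanism concrete, the following is a minimal PyTorch sketch of a state-regularized recurrent cell as described in the abstract: an ordinary GRU update followed by a softmax over a finite set of learnable state centroids, whose probability-weighted mixture becomes the next hidden state. The module name, the centroid parameterization and the temperature are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StateRegularizedGRUCell(nn.Module):
    """Sketch of a state-regularized recurrent cell: after each GRU step,
    the hidden state is softly snapped to a finite set of learnable states
    (centroids), which makes the transition dynamics resemble a
    probabilistic finite-state machine."""

    def __init__(self, input_size, hidden_size, num_states, temperature=1.0):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        # the finite set of learnable states the hidden state is regularized towards
        self.centroids = nn.Parameter(torch.randn(num_states, hidden_size))
        self.temperature = temperature

    def forward(self, x_t, h_prev):
        u_t = self.cell(x_t, h_prev)                      # ordinary recurrent update
        scores = u_t @ self.centroids.t() / self.temperature
        alpha = F.softmax(scores, dim=-1)                 # transition probabilities over states
        h_t = alpha @ self.centroids                      # probability-weighted mixture of states
        return h_t, alpha                                 # alpha is what automata extraction reads off
```

At a low temperature the softmax approaches one-hot, so taking the argmax of alpha at every step yields a discrete state sequence from which a finite state automaton can be extracted.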
Related papers
- Recurrent Neural Language Models as Probabilistic Finite-state Automata [66.23172872811594]
We study what classes of probability distributions RNN LMs can represent.
We show that simple RNNs are equivalent to a subclass of probabilistic finite-state automata.
These results present a first step towards characterizing the classes of distributions RNN LMs can represent.
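For readers unfamiliar with the formalism, the toy probabilistic finite-state automaton below illustrates the kind of string distribution that RNN language models are compared against; the states, symbols and probabilities are invented for illustration and are not taken from the cited paper.

```python
import numpy as np

# Minimal probabilistic finite-state automaton (PFSA): from each state the
# automaton emits a symbol and moves to a next state with a fixed probability,
# which defines a distribution over strings.
transition = {
    "q0": [(0.6, "a", "q1"), (0.4, "b", "q0")],
    "q1": [(0.5, "b", "q0"), (0.5, "<eos>", None)],   # None terminates the string
}

def sample_string(rng, start="q0", max_len=20):
    state, out = start, []
    while state is not None and len(out) < max_len:
        probs, choices = zip(*[(p, (sym, nxt)) for p, sym, nxt in transition[state]])
        sym, state = choices[rng.choice(len(choices), p=probs)]
        if sym != "<eos>":
            out.append(sym)
    return "".join(out)

print(sample_string(np.random.default_rng(0)))
```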
arXiv Detail & Related papers (2023-10-08T13:36:05Z)
- On the Computational Complexity and Formal Hierarchy of Second Order
Recurrent Neural Networks [59.85314067235965]
We extend the theoretical foundation for the second-order recurrent network (2nd-order RNN).
We prove there exists a class of 2nd-order RNNs that is Turing-complete with bounded time.
We also demonstrate that 2nd-order RNNs, without memory, outperform modern-day models such as vanilla RNNs and gated recurrent units in recognizing regular grammars.
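As a reference point, a second-order recurrent cell replaces the usual additive update with a multiplicative interaction between the previous hidden state and the current input through a three-way weight tensor. The sketch below spells this out; sizes, initialization and the sigmoid nonlinearity are illustrative choices, not details from the cited paper.

```python
import torch
import torch.nn as nn

class SecondOrderRNNCell(nn.Module):
    """Second-order (tensor) recurrent cell:
    h_t[j] = sigmoid( sum_{i,k} W[j,i,k] * h_{t-1}[i] * x_t[k] + b[j] )."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.W = nn.Parameter(0.1 * torch.randn(hidden_size, hidden_size, input_size))
        self.b = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x_t, h_prev):
        # einsum over the 3-way weight tensor implements the multiplicative update
        pre = torch.einsum("jik,bi,bk->bj", self.W, h_prev, x_t) + self.b
        return torch.sigmoid(pre)
```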
arXiv Detail & Related papers (2023-09-26T06:06:47Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Artificial Neuronal Ensembles with Learned Context Dependent Gating [0.0]
We introduce Learned Context Dependent Gating (LXDG), a method to flexibly allocate and recall artificial neuronal ensembles.
Activities in the hidden layers of the network are modulated by gates, which are dynamically produced during training.
We demonstrate the ability of this method to alleviate catastrophic forgetting on continual learning benchmarks.
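The gating idea can be illustrated with a small sketch of generic context-dependent gating (not the exact LXDG construction, whose details are not given here): one learnable gate vector per context multiplicatively modulates the hidden activations, so different contexts recruit different ensembles of units.

```python
import torch
import torch.nn as nn

class ContextGatedLayer(nn.Module):
    """Illustrative context-dependent gating: a per-context gate vector
    multiplies the hidden activations, switching sub-ensembles on or off."""

    def __init__(self, in_features, out_features, num_contexts):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # one learnable gate logit vector per context/task
        self.gate_logits = nn.Parameter(torch.zeros(num_contexts, out_features))

    def forward(self, x, context_id):
        h = torch.relu(self.linear(x))
        gate = torch.sigmoid(self.gate_logits[context_id])  # values in (0, 1)
        return h * gate  # only the gated-on ensemble stays active for this context
```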
arXiv Detail & Related papers (2023-01-17T20:52:48Z)
- Recurrent Neural Networks for Learning Long-term Temporal Dependencies
with Reanalysis of Time Scale Representation [16.32068729107421]
We argue that the interpretation of a forget gate as a temporal representation is valid when the gradient of the loss with respect to the state decays exponentially going back in time.
We propose an approach to construct new RNNs that can represent a longer time scale than conventional models.
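The link between a forget gate and a time scale can be made concrete with a back-of-the-envelope calculation: if the gate stays near a constant value f, the cell state decays roughly like f^k over k steps, giving an effective time constant of about -1/ln(f). The snippet below tabulates this; the gate values are illustrative and not taken from the cited paper.

```python
import numpy as np

# Effective memory time scale implied by a (roughly constant) forget-gate value f:
# the cell state decays like f**k after k steps, so tau = -1 / ln(f).
for f in [0.5, 0.9, 0.99, 0.999]:
    tau = -1.0 / np.log(f)
    print(f"forget gate {f:>6}: remembers roughly {tau:8.1f} steps")
```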
arXiv Detail & Related papers (2021-11-05T06:22:58Z)
- Learning and Generalization in RNNs [11.107204912245841]
We prove that simple recurrent neural networks can learn functions of sequences.
New ideas enable us to extract information from the hidden state of the RNN in our proofs.
arXiv Detail & Related papers (2021-05-31T18:27:51Z)
- Recognizing and Verifying Mathematical Equations using Multiplicative
Differential Neural Units [86.9207811656179]
We show that memory-augmented neural networks (NNs) can achieve higher-order extrapolation, stable performance, and faster convergence.
Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
arXiv Detail & Related papers (2021-04-07T03:50:11Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
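As background for the conversion idea, the sketch below shows the generic rate-coding intuition that makes ANN-to-SNN conversion possible, namely that an integrate-and-fire neuron driven by a constant input fires at a rate roughly proportional to the corresponding ReLU activation. This is only an illustration of the principle, not the paper's progressive tandem learning scheme.

```python
# Rate-coding illustration: the spike rate of an integrate-and-fire neuron
# approximates the analog (ReLU) activation that drives it.
def spike_rate(current, threshold=1.0, steps=1000):
    v, spikes = 0.0, 0
    for _ in range(steps):
        v += current              # integrate the constant input current
        if v >= threshold:        # fire and reset by subtraction
            spikes += 1
            v -= threshold
    return spikes / steps

for a in [0.0, 0.1, 0.35, 0.7]:   # analog activations to approximate
    print(f"activation {a:.2f} -> spike rate {spike_rate(a):.2f}")
```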
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Stability of Internal States in Recurrent Neural Networks Trained on
Regular Languages [0.0]
We study the stability of neural networks trained to recognize regular languages.
In the saturated regime, analysis of the network activations reveals a set of clusters that resemble discrete states in a finite state machine.
We show that transitions between these states in response to input symbols are deterministic and stable.
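A common recipe behind this kind of analysis, and the one closest to the automata extraction theme of the main paper, is to cluster the hidden states and read off a (state, symbol) -> next-state transition table. The sketch below uses an untrained GRU, random symbols and arbitrary clustering settings purely as stand-ins, so everything beyond the general recipe is an assumption rather than the cited paper's setup.

```python
import torch
import torch.nn.functional as F
from collections import Counter, defaultdict
from sklearn.cluster import KMeans

torch.manual_seed(0)
rnn = torch.nn.GRU(input_size=3, hidden_size=16, batch_first=True)  # stand-in for a trained model
symbols = torch.randint(0, 3, (1, 200))                             # toy symbol sequence
inputs = F.one_hot(symbols, num_classes=3).float()
with torch.no_grad():
    hidden_seq, _ = rnn(inputs)                                      # (1, 200, 16)

# cluster hidden states into candidate "discrete states"
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(hidden_seq[0].numpy())

# count observed transitions (state, next symbol) -> next state
transitions = defaultdict(Counter)
for t in range(len(labels) - 1):
    transitions[(int(labels[t]), int(symbols[0, t + 1]))][int(labels[t + 1])] += 1

# a near-deterministic table indicates finite-state-machine-like behaviour
for key, counts in sorted(transitions.items()):
    print(key, "->", counts.most_common(1)[0])
```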
arXiv Detail & Related papers (2020-06-18T19:50:15Z)
- Recognizing Long Grammatical Sequences Using Recurrent Networks
Augmented With An External Differentiable Stack [73.48927855855219]
Recurrent neural networks (RNNs) are a widely used deep architecture for sequence modeling, generation, and prediction.
RNNs generalize poorly over very long sequences, which limits their applicability to many important temporal processing and time series forecasting problems.
One way to address these shortcomings is to couple an RNN with an external, differentiable memory structure, such as a stack.
In this paper, we improve the memory-augmented RNN with important architectural and state updating mechanisms.
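The core of such stack augmentation is a soft push/pop update that keeps the stack end-to-end differentiable. The sketch below shows a generic version of this mechanism; the module name, dimensions and exact update rule are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferentiableStack(nn.Module):
    """Generic differentiable stack: the controller state h produces soft
    action weights (push / pop / no-op) and a value to push; the new stack
    is a convex combination of the three possible outcomes."""

    def __init__(self, hidden_size, stack_dim, depth=16):
        super().__init__()
        self.action = nn.Linear(hidden_size, 3)       # push, pop, no-op logits
        self.value = nn.Linear(hidden_size, stack_dim)

    def forward(self, h, stack):
        # stack: (batch, depth, stack_dim), index 0 is the top
        a = F.softmax(self.action(h), dim=-1)                      # (batch, 3)
        v = torch.tanh(self.value(h)).unsqueeze(1)                 # value to push
        pushed = torch.cat([v, stack[:, :-1]], dim=1)              # shift down, new top
        popped = torch.cat([stack[:, 1:], torch.zeros_like(stack[:, :1])], dim=1)
        new_stack = (a[:, 0, None, None] * pushed
                     + a[:, 1, None, None] * popped
                     + a[:, 2, None, None] * stack)
        return new_stack, new_stack[:, 0]                          # new stack and its soft top
```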
arXiv Detail & Related papers (2020-04-04T14:19:15Z)
- Going in circles is the way forward: the role of recurrence in visual
inference [0.0]
State-of-the-art neural network models for visual recognition rely heavily or exclusively on feedforward computation.
This success might suggest that computational neuroscientists need not engage recurrent computation.
We argue that feedforward networks (FNNs) are a special case of RNNs and that computational neuroscientists and engineers should engage recurrence to understand how brains and machines can achieve greater and more flexible computational depth.
arXiv Detail & Related papers (2020-03-26T19:53:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.