Extracting Finite Automata from RNNs Using State Merging
- URL: http://arxiv.org/abs/2201.12451v1
- Date: Fri, 28 Jan 2022 23:03:25 GMT
- Title: Extracting Finite Automata from RNNs Using State Merging
- Authors: William Merrill and Nikolaos Tsilivis
- Abstract summary: We propose a new method for extracting finite automata from RNNs inspired by the state merging paradigm from grammatical inference.
We demonstrate the effectiveness of our method on the Tomita languages benchmark, where we find that it is able to extract faithful automata from RNNs trained on all languages in the benchmark.
- Score: 1.8072051868187933
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One way to interpret the behavior of a blackbox recurrent neural network
(RNN) is to extract from it a more interpretable discrete computational model,
like a finite state machine, that captures its behavior. In this work, we
propose a new method for extracting finite automata from RNNs inspired by the
state merging paradigm from grammatical inference. We demonstrate the
effectiveness of our method on the Tomita languages benchmark, where we find
that it is able to extract faithful automata from RNNs trained on all languages
in the benchmark. We find that extraction performance is aided by the amount of
data provided during the extraction process, as well as, curiously, by whether the
RNN model is trained for additional epochs after perfectly learning its target
language. We use our method to analyze this phenomenon, finding that training
beyond convergence is useful because it leads to compression of the internal
state space of the RNN. This finding demonstrates how our method can be used
for interpretability and analysis of trained RNN models.
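To make the state-merging idea above concrete, here is a minimal, hypothetical sketch rather than the authors' actual algorithm: strings are labeled by a stand-in `rnn_accepts` membership oracle (a toy parity check in place of a trained RNN), arranged into a prefix tree acceptor, and then greedily merged wherever the observed labels agree, in the spirit of RPNI-style grammatical inference. The oracle, the sample set, and the merging criterion are illustrative assumptions; the paper's own merging criterion over RNN hidden states is not reproduced here.

```python
# Hedged sketch of state merging for automaton extraction (not the paper's exact method).
from itertools import product

ALPHABET = "01"

def rnn_accepts(s):
    """Stand-in for querying a trained RNN binary classifier (hypothetical)."""
    return s.count("1") % 2 == 0  # toy target language: even number of 1s

def build_prefix_tree(samples):
    """Map every prefix of every sample string to a state; record oracle labels."""
    states = {"": 0}
    trans, label = {}, {0: rnn_accepts("")}
    for s in samples:
        for i in range(1, len(s) + 1):
            prefix, parent = s[:i], s[:i - 1]
            if prefix not in states:
                states[prefix] = len(states)
                trans[(states[parent], prefix[-1])] = states[prefix]
                label[states[prefix]] = rnn_accepts(prefix)
    return trans, label

def compatible(a, b, trans, label):
    """Two states may merge if their labels agree and so do all shared successors."""
    if label[a] != label[b]:
        return False
    for c in ALPHABET:
        na, nb = trans.get((a, c)), trans.get((b, c))
        if na is not None and nb is not None and not compatible(na, nb, trans, label):
            return False
    return True

def merge_states(trans, label):
    """Greedily fold each state into the lowest-numbered compatible representative."""
    rep = {q: q for q in label}
    for a in sorted(label):
        for b in sorted(label):
            if b >= a:
                break
            if rep[a] == a and rep[b] == b and compatible(a, b, trans, label):
                rep[a] = b
                break
    merged = {(rep[q], c): rep[nq] for (q, c), nq in trans.items()}
    accepting = {rep[q] for q in label if label[q]}
    return merged, accepting, rep[0]

if __name__ == "__main__":
    # Label all strings up to length 5 with the oracle, then extract a DFA.
    samples = ["".join(w) for n in range(1, 6) for w in product(ALPHABET, repeat=n)]
    trans, label = build_prefix_tree(samples)
    dfa, accepting, start = merge_states(trans, label)
    n_states = len({start} | set(dfa.values()) | {q for q, _ in dfa})
    print(f"extracted DFA with {n_states} states, accepting states: {accepting}")
```

On the toy parity oracle, the script collapses the 63-state prefix tree into a 2-state automaton; this is loosely the kind of state-space compression that the abstract reports observing in RNNs trained beyond convergence.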
Related papers
- Learning Useful Representations of Recurrent Neural Network Weight Matrices [30.583752432727326]
Recurrent Neural Networks (RNNs) are general-purpose parallel-sequential computers.
How can we learn useful representations of RNN weights that facilitate RNN analysis as well as downstream tasks?
We consider several mechanistic approaches for RNN weights and adapt the permutation equivariant Deep Weight Space layer for RNNs.
Our two novel functionalist approaches extract information from RNN weights by 'interrogating' the RNN through probing inputs.
arXiv Detail & Related papers (2024-03-18T17:32:23Z)
- DeepCover: Advancing RNN Test Coverage and Online Error Prediction using State Machine Extraction [0.0]
Recurrent neural networks (RNNs) have emerged as powerful tools for processing sequential data in various fields, including natural language processing and speech recognition.
The lack of explainability in RNN models has limited their interpretability, posing challenges in understanding their internal workings.
This paper proposes a methodology for extracting a state machine (SM) from an RNN-based model to provide insights into its internal function.
arXiv Detail & Related papers (2024-02-10T14:45:23Z)
- On the Representational Capacity of Recurrent Neural Language Models [56.19166912044362]
We show that a rationally weighted RLM with unbounded computation time can simulate any deterministic probabilistic Turing machine (PTM) with rationally weighted transitions.
We also provide a lower bound by showing that under the restriction to real-time computation, such models can simulate deterministic real-time rational PTMs.
arXiv Detail & Related papers (2023-10-19T17:39:47Z)
- On the Computational Complexity and Formal Hierarchy of Second Order Recurrent Neural Networks [59.85314067235965]
We extend the theoretical foundation for the second-order recurrent network (2nd RNN).
We prove there exists a class of 2nd RNNs that is Turing-complete with bounded time.
We also demonstrate that 2nd-order RNNs, without memory, outperform modern-day models such as vanilla RNNs and gated recurrent units in recognizing regular grammars.
arXiv Detail & Related papers (2023-09-26T06:06:47Z)
- Advancing Regular Language Reasoning in Linear Recurrent Neural Networks [56.11830645258106]
We study whether linear recurrent neural networks (LRNNs) can learn the hidden rules in training sequences.
We propose a new LRNN equipped with a block-diagonal and input-dependent transition matrix.
Experiments suggest that the proposed model is the only LRNN capable of performing length extrapolation on regular language tasks.
arXiv Detail & Related papers (2023-09-14T03:36:01Z)
- DeepSeer: Interactive RNN Explanation and Debugging via State Abstraction [10.110976560799612]
Recurrent Neural Networks (RNNs) have been widely used in Natural Language Processing (NLP) tasks.
DeepSeer is an interactive system that provides both global and local explanations of RNN behavior.
arXiv Detail & Related papers (2023-03-02T21:08:17Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Extracting Weighted Finite Automata from Recurrent Neural Networks for Natural Languages [9.249443355045967]
Recurrent Neural Networks (RNNs) have achieved tremendous success in sequential data processing.
It is quite challenging to interpret and verify RNNs' behaviors directly.
We propose a transition rule extraction approach, which is scalable to natural language processing models.
arXiv Detail & Related papers (2022-06-27T09:30:13Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Adaptive Nearest Neighbor Machine Translation [60.97183408140499]
kNN-MT combines pre-trained neural machine translation with token-level k-nearest-neighbor retrieval.
The traditional kNN algorithm simply retrieves the same number of nearest neighbors for each target token.
We propose Adaptive kNN-MT to dynamically determine the number k of retrieved neighbors for each target token.
arXiv Detail & Related papers (2021-05-27T09:27:42Z)
- Distillation of Weighted Automata from Recurrent Neural Networks using a Spectral Approach [0.0]
This paper is an attempt to bridge the gap between deep learning and grammatical inference.
It provides an algorithm to extract a formal language from any recurrent neural network trained for language modelling.
arXiv Detail & Related papers (2020-09-28T07:04:15Z)
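As a complement to the spectral distillation paper just above, the following is a hedged, generic sketch of Hankel-matrix spectral learning of a weighted finite automaton (WFA), not that paper's specific pipeline: a black-box `score` function (here a hand-made two-state WFA standing in for a trained language model) is sampled on prefix-suffix combinations, the resulting Hankel matrix is factored by truncated SVD, and the factors yield initial, transition, and final weights. The basis length, the rank, and the stand-in scorer are all assumptions for illustration.

```python
# Hedged sketch of spectral WFA extraction from a black-box sequence scorer.
import numpy as np
from itertools import product

ALPHABET = "ab"

# Stand-in "RNN": a hidden 2-state WFA whose behavior we try to recover.
alpha0 = np.array([1.0, 0.0])
alpha_inf = np.array([1.0, 1.0])
A_true = {"a": np.array([[0.5, 0.2], [0.0, 0.4]]),
          "b": np.array([[0.1, 0.3], [0.2, 0.1]])}

def score(word):
    """Black-box score of a string (stand-in for an RNN sequence score)."""
    v = alpha0
    for c in word:
        v = v @ A_true[c]
    return float(v @ alpha_inf)

def all_words(max_len):
    return [""] + ["".join(w) for n in range(1, max_len + 1)
                   for w in product(ALPHABET, repeat=n)]

# Hankel blocks over a small prefix/suffix basis (must contain the empty word).
prefixes = suffixes = all_words(2)
H = np.array([[score(p + s) for s in suffixes] for p in prefixes])
H_sigma = {c: np.array([[score(p + c + s) for s in suffixes] for p in prefixes])
           for c in ALPHABET}
h_p = np.array([score(p) for p in prefixes])   # column for the empty suffix
h_s = np.array([score(s) for s in suffixes])   # row for the empty prefix

rank = 2                                       # assumed number of states
U, S, Vt = np.linalg.svd(H)
V = Vt[:rank].T
pinv_HV = np.linalg.pinv(H @ V)

a0 = h_s @ V                                   # learned initial weights
a_inf = pinv_HV @ h_p                          # learned final weights
A = {c: pinv_HV @ H_sigma[c] @ V for c in ALPHABET}

def wfa_score(word):
    v = a0
    for c in word:
        v = v @ A[c]
    return float(v @ a_inf)

for w in ["", "ab", "ba", "aabb"]:
    print(w or "(empty)", round(score(w), 6), round(wfa_score(w), 6))
```

Because the stand-in scorer is itself a rank-2 WFA and the basis covers enough prefixes and suffixes, the reconstructed scores match the originals up to numerical precision; with a real RNN language model, the Hankel entries would be estimated from its sequence probabilities and the recovered automaton would only approximate its behavior.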