Distance and Equivalence between Finite State Machines and Recurrent
Neural Networks: Computational results
- URL: http://arxiv.org/abs/2004.00478v1
- Date: Wed, 1 Apr 2020 14:48:59 GMT
- Title: Distance and Equivalence between Finite State Machines and Recurrent
Neural Networks: Computational results
- Authors: Reda Marzouk and Colin de la Higuera
- Abstract summary: We prove several computational results related to the problem of extracting Finite State Machine (FSM) based models from trained RNN language models.
Our reduction technique from 3-SAT makes the NP-hardness of the finite-support distance approximation easily generalizable to other RNN architectures.
- Score: 0.348097307252416
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The need to interpret Deep Learning (DL) models has led, over the past
years, to a proliferation of works addressing this issue. Among the strategies
that aim to shed light on how information is represented internally in DL models,
one consists of extracting symbolic, rule-based machines from connectionist models,
on the assumption that the former approximate the latter's behaviour well. To judge
how reasonable these approximation strategies are, we need to know the computational
complexity of measuring the quality of the approximation. In this article, we prove
several computational results related to the problem of extracting Finite State
Machine (FSM) based models from trained RNN language models. More precisely, we show
the following:
(a) For general weighted RNN-LMs with a single hidden layer and a ReLU activation:
- The equivalence problem between a PDFA/PFA/WFA and a weighted first-order RNN-LM is undecidable;
- As a corollary, the distance problem between the languages generated by a PDFA/PFA/WFA and that of a weighted RNN-LM is not recursive;
- The intersection between a DFA and the cut language of a weighted RNN-LM is undecidable;
- The equivalence of a PDFA/PFA/WFA and a weighted RNN-LM on a finite support is EXP-Hard.
(b) For consistent weighted RNN-LMs with any computable activation function:
- The Chebyshev distance approximation is decidable;
- The Chebyshev distance approximation on a finite support is NP-Hard.
Moreover, our reduction technique from 3-SAT makes this latter result easily
generalizable to other RNN architectures (e.g. LSTMs) and to RNNs with finite
precision.
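To make the finite-support variant of the distance problem concrete, the sketch below computes the Chebyshev (L-infinity) distance max_{w in S} |P_FSM(w) - P_RNN(w)| between a PDFA and a single-hidden-layer ReLU RNN-LM, restricted to a finite support S of strings. This is an illustrative sketch only: the dictionary-based PDFA encoding, the toy softmax RNN-LM parameterization, and all function names are assumptions made for exposition, not the paper's constructions.

```python
import numpy as np

def pdfa_string_prob(w, start, trans, stop):
    """P(w) under a PDFA: trans[q][a] = (next_state, prob), stop[q] = P(stop | q)."""
    q, p = start, 1.0
    for a in w:
        if a not in trans[q]:
            return 0.0
        q, pa = trans[q][a]
        p *= pa
    return p * stop[q]

def rnn_lm_string_prob(w, W_h, W_x, b, W_out, alphabet):
    """P(w) under a toy single-hidden-layer ReLU RNN-LM with a softmax output
    layer; the last output row is reserved for the end-of-string event."""
    h = np.zeros(b.shape)
    p = 1.0
    for a in list(w) + [None]:                          # None = end of string
        z = W_out @ h
        probs = np.exp(z - z.max())
        probs /= probs.sum()
        p *= probs[-1] if a is None else probs[alphabet.index(a)]
        if a is not None:
            x = np.eye(len(alphabet))[alphabet.index(a)]    # one-hot input
            h = np.maximum(0.0, W_h @ h + W_x @ x + b)      # ReLU update
    return p

def chebyshev_distance(support, pdfa, rnn, alphabet):
    """max_{w in support} |P_PDFA(w) - P_RNN(w)|: the finite-support Chebyshev
    distance whose approximation the paper studies."""
    start, trans, stop = pdfa
    return max(abs(pdfa_string_prob(w, start, trans, stop)
                   - rnn_lm_string_prob(w, *rnn, alphabet))
               for w in support)
```

Deciding whether this maximum exceeds a given threshold is the kind of finite-support question whose hardness the paper establishes via the 3-SAT reduction.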
Related papers
- Learning Useful Representations of Recurrent Neural Network Weight Matrices [30.583752432727326]
Recurrent Neural Networks (RNNs) are general-purpose parallel-sequential computers.
How to learn useful representations of RNN weights that facilitate RNN analysis as well as downstream tasks?
We consider several mechanistic approaches for RNN weights and adapt the permutation equivariant Deep Weight Space layer for RNNs.
Our two novel functionalist approaches extract information from RNN weights by 'interrogating' the RNN through probing inputs.
arXiv Detail & Related papers (2024-03-18T17:32:23Z) - On the Representational Capacity of Recurrent Neural Language Models [56.19166912044362]
We show that a rationally weighted RLM with unbounded computation time can simulate any deterministic probabilistic Turing machine (PTM) with rationally weighted transitions.
We also provide a lower bound by showing that under the restriction to real-time computation, such models can simulate deterministic real-time rational PTMs.
arXiv Detail & Related papers (2023-10-19T17:39:47Z) - Recurrent Neural Language Models as Probabilistic Finite-state Automata [66.23172872811594]
We study what classes of probability distributions RNN LMs can represent.
We show that simple RNNs are equivalent to a subclass of probabilistic finite-state automata.
These results present a first step towards characterizing the classes of distributions RNN LMs can represent.
arXiv Detail & Related papers (2023-10-08T13:36:05Z) - On the Computational Complexity and Formal Hierarchy of Second Order
Recurrent Neural Networks [59.85314067235965]
We extend the theoretical foundation for the second-order recurrent network (2nd RNN).
We prove there exists a class of 2nd RNNs that is Turing-complete with bounded time.
We also demonstrate that 2nd-order RNNs, without memory, outperform modern-day models such as vanilla RNNs and gated recurrent units in recognizing regular grammars.
arXiv Detail & Related papers (2023-09-26T06:06:47Z) - Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs) represented by long short-term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly complex.
In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z) - Comparative Analysis of Interval Reachability for Robust Implicit and
Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z) - Multilevel Bayesian Deep Neural Networks [0.5892638927736115]
We consider inference associated with deep neural networks (DNNs) and in particular, trace-class neural network (TNN) priors.
TNN priors are defined on functions with infinitely many hidden units, and have strongly convergent approximations with finitely many hidden units.
In this paper, we leverage the strong convergence of TNN in order to apply Multilevel Monte Carlo (MLMC) to these models.
arXiv Detail & Related papers (2022-03-24T09:49:27Z) - Adaptive Discounting of Implicit Language Models in RNN-Transducers [33.63456351411599]
We show how a lightweight adaptive LM discounting technique can be used with any RNN-T architecture.
We obtain up to 4% and 14% relative reductions in overall WER and rare word PER, respectively, on a conversational, code-mixed Hindi-English ASR task.
arXiv Detail & Related papers (2022-02-21T08:44:56Z) - Explainable Natural Language Processing with Matrix Product States [2.3243389656894595]
We perform a systematic analysis of RNNs' behaviors in a ubiquitous NLP task, the sentiment analysis of movie reviews.
We show that single-layer RACs possess a maximum information propagation capacity.
Our work sheds light on the phenomenology of learning in RACs and more generally on the explainability aspects of RNNs for NLP.
arXiv Detail & Related papers (2021-12-16T05:10:32Z) - Connecting Weighted Automata, Tensor Networks and Recurrent Neural
Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks (a minimal WFA string-weight computation is sketched after this list).
We introduce the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors.
arXiv Detail & Related papers (2020-10-19T15:28:00Z) - On Computability, Learnability and Extractability of Finite State
Machines from Recurrent Neural Networks [0.0]
This work aims at shedding some light on the connections between finite state machines (FSMs) and recurrent neural networks (RNNs).
The connections examined in this master's thesis are threefold: the extractability of finite state machines from recurrent neural networks, learnability aspects, and computational links.
arXiv Detail & Related papers (2020-09-10T15:55:30Z)
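As background for the weighted-automata entry above: a weighted finite automaton (WFA) with initial vector alpha, transition matrices {A_sigma}, and final vector beta assigns the weight alpha^T A_{w_1} ... A_{w_n} beta to a string w_1...w_n, and linear 2-RNNs compute the same kind of multilinear map over input sequences. The snippet below is a minimal, self-contained illustration with made-up numbers, not code from any of the listed papers.

```python
import numpy as np

def wfa_weight(word, alpha, A, beta):
    """Weight a WFA (alpha, {A_sigma}, beta) assigns to `word`:
    f(w_1 ... w_n) = alpha^T A_{w_1} ... A_{w_n} beta."""
    v = alpha
    for sym in word:
        v = v @ A[sym]                  # propagate through the transition matrix
    return float(v @ beta)

# Tiny two-state WFA over the alphabet {a, b}, with arbitrary illustrative weights.
alpha = np.array([1.0, 0.0])            # initial weight vector
beta  = np.array([0.0, 1.0])            # final weight vector
A = {
    "a": np.array([[0.5, 0.5],
                   [0.0, 1.0]]),
    "b": np.array([[1.0, 0.0],
                   [0.2, 0.8]]),
}

print(wfa_weight("ab", alpha, A, beta))  # alpha^T A_a A_b beta = 0.4
```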