Synthesizing Context-free Grammars from Recurrent Neural Networks
(Extended Version)
- URL: http://arxiv.org/abs/2101.08200v2
- Date: Tue, 9 Feb 2021 21:28:28 GMT
- Title: Synthesizing Context-free Grammars from Recurrent Neural Networks
(Extended Version)
- Authors: Daniel M. Yellin, Gail Weiss
- Abstract summary: We present an algorithm for extracting a subclass of the context-free grammars (CFGs) from a trained recurrent neural network (RNN).
We develop a new framework, pattern rule sets (PRSs), which describe sequences of deterministic finite automata (DFAs) that approximate a non-regular language.
We show how the PRS may be converted into a CFG, enabling a familiar and useful presentation of the learned language.
- Score: 6.3455238301221675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an algorithm for extracting a subclass of the context-free
grammars (CFGs) from a trained recurrent neural network (RNN). We develop a new
framework, pattern rule sets (PRSs), which describe sequences of deterministic
finite automata (DFAs) that approximate a non-regular language. We present an
algorithm for recovering the PRS behind a sequence of such automata, and apply
it to the sequences of automata extracted from trained RNNs using the L*
algorithm. We then show how the PRS may be converted into a CFG, enabling a
familiar and useful presentation of the learned language.
Extracting the learned language of an RNN is important to facilitate
understanding of the RNN and to verify its correctness. Furthermore, the
extracted CFG can augment the RNN in classifying correct sentences, as the
RNN's predictive accuracy decreases when the recursion depth and the distance
between matching delimiters of its input sequences increase.
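To make the objects in the abstract concrete, here is a minimal, self-contained Python sketch. It is not the paper's PRS-recovery algorithm; it only illustrates the kind of DFA sequence and CFG involved, using balanced parentheses as a standard example of a non-regular language. In the paper the DFAs are extracted from the trained RNN via L*; here they are constructed directly, and all function names are illustrative.

```python
# Illustrative sketch only: the kind of DFA sequence and CFG the paper works
# with, not the PRS-recovery algorithm itself. In the paper, the DFAs come
# from L* queries to a trained RNN; here they are built directly.

def make_depth_k_dfa(k):
    """DFA accepting balanced '()' strings whose nesting depth never exceeds k.

    States are the current nesting depth 0..k, plus a rejecting sink for
    strings that close an unopened parenthesis or nest deeper than k.
    """
    sink = k + 1

    def accepts(word):
        state = 0
        for ch in word:
            if state == sink:
                break
            elif ch == "(":
                state = state + 1 if state < k else sink
            elif ch == ")":
                state = state - 1 if state > 0 else sink
            else:
                state = sink
        return state == 0

    return accepts


def cfg_accepts(word):
    """Membership in the CFG  S -> '(' S ')' S | epsilon  (all balanced strings)."""
    depth = 0
    for ch in word:
        if ch not in "()":
            return False
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0


if __name__ == "__main__":
    words = ["", "()", "(())", "()()", "((()))", "(()", ")("]
    for k in (1, 2, 3):
        dfa = make_depth_k_dfa(k)
        # Each depth-k DFA agrees with the CFG on strings nested at most k
        # deep; the regular way DFA_k grows into DFA_{k+1} is the repeating
        # "pattern" a pattern rule set (PRS) is meant to capture and that the
        # paper converts into CFG rules.
        print(k, [(w, dfa(w), cfg_accepts(w)) for w in words])
```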
Related papers
- Advancing Regular Language Reasoning in Linear Recurrent Neural Networks [56.11830645258106]
We study whether linear recurrent neural networks (LRNNs) can learn the hidden rules in training sequences.
We propose a new LRNN equipped with a block-diagonal and input-dependent transition matrix (an illustrative sketch follows this entry).
Experiments suggest that the proposed model is the only LRNN capable of performing length extrapolation on regular language tasks.
arXiv Detail & Related papers (2023-09-14T03:36:01Z)
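For the LRNN entry above, a block-diagonal, input-dependent transition can be pictured with a short numpy sketch. The shapes, the tanh parameterization, and all names below are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def lrnn_step(h, x, W_a, U, num_blocks):
    """One linear-recurrence step h_t = A(x_t) h_{t-1} + U x_t, where A(x_t)
    is block-diagonal and its entries depend on the input (illustrative only).

    h: (d,) hidden state, split into `num_blocks` blocks of size b = d // num_blocks
    x: (m,) input vector
    W_a: (num_blocks, b, b, m) tensor; block k of A(x) is tanh(W_a[k] @ x)
    U: (d, m) input projection
    """
    d = h.shape[0]
    b = d // num_blocks
    h_new = np.empty_like(h)
    for k in range(num_blocks):
        A_k = np.tanh(W_a[k] @ x)  # (b, b): input-dependent transition block
        h_new[k * b:(k + 1) * b] = A_k @ h[k * b:(k + 1) * b]
    return h_new + U @ x

# Toy usage: run the recurrence over a short random sequence.
rng = np.random.default_rng(0)
d, m, num_blocks = 8, 4, 2
W_a = rng.normal(scale=0.1, size=(num_blocks, d // num_blocks, d // num_blocks, m))
U = rng.normal(scale=0.1, size=(d, m))
h = np.zeros(d)
for x in rng.normal(size=(5, m)):
    h = lrnn_step(h, x, W_a, U, num_blocks)
print(h.shape)  # (8,)
```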
- Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator [9.053645441056256]
This paper proposes an innovative method for achieving end-to-end contextual ASR using graph neural network (GNN) encodings.
GNN encodings facilitate lookahead for future word pieces in the process of ASR decoding at each tree node.
The performance of the systems was evaluated on the LibriSpeech and AMI corpora, following the visually grounded contextual ASR pipeline.
arXiv Detail & Related papers (2023-05-30T08:20:58Z)
- Return of the RNN: Residual Recurrent Networks for Invertible Sentence Embeddings [0.0]
This study presents a novel model for invertible sentence embeddings using a residual recurrent network trained on an unsupervised encoding task.
Rather than the probabilistic outputs common to neural machine translation models, our approach employs a regression-based output layer to reconstruct the input sequence's word vectors.
The model achieves high accuracy and fast training with the Adam optimizer, a significant finding given that RNNs typically require memory units, such as LSTMs, or second-order optimization methods.
arXiv Detail & Related papers (2023-03-23T15:59:06Z)
- Sequence Transduction with Graph-based Supervision [96.04967815520193]
We present a new transducer objective function that generalizes the RNN-T loss to accept a graph representation of the labels.
We demonstrate that transducer-based ASR with a CTC-like lattice achieves better results than standard RNN-T.
arXiv Detail & Related papers (2021-11-01T21:51:42Z)
- Learning Hierarchical Structures with Differentiable Nondeterministic Stacks [25.064819128982556]
We present a stack RNN model based on the recently proposed Nondeterministic Stack RNN (NS-RNN).
We show that the NS-RNN achieves lower cross-entropy than all previous stack RNNs on five context-free language modeling tasks.
We also propose a restricted version of the NS-RNN that makes it practical to use for language modeling on natural language.
arXiv Detail & Related papers (2021-09-05T03:25:23Z)
- NSL: Hybrid Interpretable Learning From Noisy Raw Data [66.15862011405882]
This paper introduces a hybrid neural-symbolic learning framework, called NSL, that learns interpretable rules from labelled unstructured data.
NSL combines pre-trained neural networks for feature extraction with FastLAS, a state-of-the-art ILP system for rule learning under the answer set semantics.
We demonstrate that NSL is able to learn robust rules from MNIST data and achieve comparable or superior accuracy when compared to neural network and random forest baselines.
arXiv Detail & Related papers (2020-12-09T13:02:44Z)
- Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks.
We introduce the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors.
arXiv Detail & Related papers (2020-10-19T15:28:00Z)
- Distillation of Weighted Automata from Recurrent Neural Networks using a Spectral Approach [0.0]
This paper is an attempt to bridge the gap between deep learning and grammatical inference.
It provides an algorithm to extract a formal language from any recurrent neural network trained for language modelling (a generic spectral sketch follows this entry).
arXiv Detail & Related papers (2020-09-28T07:04:15Z)
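For the spectral-distillation entry above, the core spectral step — recovering a weighted automaton from Hankel blocks filled with scores queried from a sequence model — can be sketched generically. The basis choice, the scoring function f (a stand-in here for scores obtained from a trained RNN), and all names are assumptions; this is the textbook spectral recipe, not the paper's exact procedure.

```python
import numpy as np

def spectral_wfa(prefixes, suffixes, alphabet, f, rank):
    """Recover a weighted automaton (alpha0, {A_sigma}, alpha_inf) from Hankel
    blocks H[p, s] = f(p + s) and H_sigma[p, s] = f(p + sigma + s).

    `f` is any real-valued function on strings (e.g., string scores queried
    from a trained RNN language model). prefixes[0] and suffixes[0] are
    assumed to be the empty string "".
    """
    H = np.array([[f(p + s) for s in suffixes] for p in prefixes])
    U, svals, Vt = np.linalg.svd(H, full_matrices=False)
    P = U[:, :rank] * svals[:rank]      # forward factor,  H ~= P @ S
    S = Vt[:rank, :]                    # backward factor
    P_inv, S_inv = np.linalg.pinv(P), np.linalg.pinv(S)

    A = {}
    for sigma in alphabet:
        H_sigma = np.array([[f(p + sigma + s) for s in suffixes] for p in prefixes])
        A[sigma] = P_inv @ H_sigma @ S_inv
    # The row/column indexed by the empty string give initial/final weights.
    alpha0 = H[0, :] @ S_inv
    alpha_inf = P_inv @ H[:, 0]
    return alpha0, A, alpha_inf

def wfa_score(word, alpha0, A, alpha_inf):
    v = alpha0
    for sigma in word:
        v = v @ A[sigma]
    return v @ alpha_inf

# Toy usage: f stands in for scores from a sequence model; f(a^n) = 0.5**(n+1).
f = lambda w: 0.5 ** (len(w) + 1) if set(w) <= {"a"} else 0.0
alpha0, A, alpha_inf = spectral_wfa(["", "a", "aa"], ["", "a", "aa"], ["a"], f, rank=1)
print(wfa_score("aaa", alpha0, A, alpha_inf))  # ~ 0.5**4 = 0.0625
```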
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) in terms of low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Recognizing Long Grammatical Sequences Using Recurrent Networks Augmented With An External Differentiable Stack [73.48927855855219]
Recurrent neural networks (RNNs) are a widely used deep architecture for sequence modeling, generation, and prediction.
RNNs generalize poorly over very long sequences, which limits their applicability to many important temporal processing and time series forecasting problems.
One way to address these shortcomings is to couple an RNN with an external, differentiable memory structure, such as a stack.
In this paper, we improve the memory-augmented RNN with important architectural and state-updating mechanisms (a generic soft-stack sketch follows this entry).
arXiv Detail & Related papers (2020-04-04T14:19:15Z)
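For the stack-augmented RNN entry above, the general idea of a differentiable (soft) stack can be sketched as follows. This is a standard soft push/pop construction with invented names, not the specific mechanisms proposed in the paper.

```python
import numpy as np

def soft_stack_update(stack, push, pop, value):
    """One differentiable stack update (generic soft-stack construction).

    stack: (depth, dim) current stack contents, row 0 is the top
    push, pop: scalars in [0, 1] with push + pop <= 1 (the rest is 'no-op')
    value: (dim,) vector to push
    Returns the new (depth, dim) stack as a convex mixture of push/pop/no-op.
    """
    noop = 1.0 - push - pop
    pushed = np.vstack([value[None, :], stack[:-1]])            # shift down
    popped = np.vstack([stack[1:], np.zeros_like(stack[:1])])   # shift up
    return push * pushed + pop * popped + noop * stack

# Toy usage: an RNN controller would emit (push, pop) and `value` each step
# and read the soft top-of-stack stack[0] back as an extra input.
stack = np.zeros((4, 3))
stack = soft_stack_update(stack, push=0.9, pop=0.0, value=np.ones(3))
stack = soft_stack_update(stack, push=0.1, pop=0.8, value=2 * np.ones(3))
print(stack[0])  # soft top-of-stack after a push followed by a mostly-pop
```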
This list is automatically generated from the titles and abstracts of the papers on this site.