Extracting Weighted Finite Automata from Recurrent Neural Networks for
Natural Languages
- URL: http://arxiv.org/abs/2206.14621v1
- Date: Mon, 27 Jun 2022 09:30:13 GMT
- Title: Extracting Weighted Finite Automata from Recurrent Neural Networks for
Natural Languages
- Authors: Zeming Wei, Xiyue Zhang and Meng Sun
- Abstract summary: Recurrent Neural Networks (RNNs) have achieved tremendous success in sequential data processing.
It is quite challenging to interpret and verify RNNs' behaviors directly.
We propose a transition rule extraction approach, which is scalable to natural language processing models.
- Score: 9.249443355045967
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent Neural Networks (RNNs) have achieved tremendous success in
sequential data processing. However, it is quite challenging to interpret and
verify RNNs' behaviors directly. To this end, many efforts have been made to
extract finite automata from RNNs. Existing approaches such as exact learning
are effective in extracting finite-state models to characterize the state
dynamics of RNNs for formal languages, but are limited in the scalability to
process natural languages. Compositional approaches that are scalable to
natural languages fall short in extraction precision. In this paper, we
identify the transition sparsity problem that heavily impacts the extraction
precision. To address this problem, we propose a transition rule extraction
approach, which is scalable to natural language processing models and effective
in improving extraction precision. Specifically, we propose an empirical method
to complement the missing rules in the transition diagram. In addition, we
further adjust the transition matrices to enhance the context-aware ability of
the extracted weighted finite automaton (WFA). Finally, we propose two data
augmentation tactics to track more dynamic behaviors of the target RNN.
Experiments on two popular natural language datasets show that our method can
extract WFA from RNN for natural language processing with better precision than
existing approaches.
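As a rough illustration of the pipeline described above (abstract the RNN's hidden states into a finite set of states, estimate per-token transition matrices from observed transitions, and complement the rows left empty by transition sparsity), the following sketch shows one plausible realization. It is a minimal, hypothetical reconstruction: the k-means abstraction, the function names, and the fallback rule for missing rows are our assumptions, not the paper's exact method.

```python
import numpy as np
from sklearn.cluster import KMeans

def extract_wfa(hidden_seqs, token_seqs, vocab, n_states=10, seed=0):
    """hidden_seqs[i][t] is the RNN hidden vector after reading token_seqs[i][t]."""
    # 1) Abstract hidden states into n_states clusters; index 0 is the initial state.
    all_h = np.vstack([h for seq in hidden_seqs for h in seq])
    km = KMeans(n_clusters=n_states, random_state=seed, n_init=10).fit(all_h)

    # 2) Count per-token transitions between abstract states.
    counts = {w: np.zeros((n_states + 1, n_states + 1)) for w in vocab}
    for h_seq, t_seq in zip(hidden_seqs, token_seqs):
        states = [0] + [s + 1 for s in km.predict(np.asarray(h_seq))]
        for prev, nxt, tok in zip(states[:-1], states[1:], t_seq):
            counts[tok][prev, nxt] += 1

    # 3) Normalize rows into transition probabilities; rows that never fired
    #    (the "missing rules") fall back to the token's average outgoing behaviour.
    wfa = {}
    for tok, C in counts.items():
        row_sums = C.sum(axis=1, keepdims=True)
        fallback = C.sum(axis=0) + 1e-8
        fallback = fallback / fallback.sum()
        wfa[tok] = np.where(row_sums > 0, C / np.maximum(row_sums, 1e-8), fallback)
    return km, wfa
```

The fallback in step 3 merely stands in for the paper's empirical complement of missing rules; the context-aware adjustment of the transition matrices and the two data augmentation tactics would be applied on top of this basic construction.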
Related papers
- DeepDFA: Automata Learning through Neural Probabilistic Relaxations [2.3326951882644553]
We introduce DeepDFA, a novel approach to identifying Deterministic Finite Automata (DFAs) from traces.
Inspired by both the probabilistic relaxation of DFAs and Recurrent Neural Networks (RNNs), our model offers interpretability post-training, alongside reduced complexity and enhanced training efficiency.
arXiv Detail & Related papers (2024-08-16T09:30:36Z)
- In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z)
- On the Representational Capacity of Recurrent Neural Language Models [56.19166912044362]
We show that a rationally weighted RLM with unbounded computation time can simulate any deterministic probabilistic Turing machine (PTM) with rationally weighted transitions.
We also provide a lower bound by showing that under the restriction to real-time computation, such models can simulate deterministic real-time rational PTMs.
arXiv Detail & Related papers (2023-10-19T17:39:47Z)
- Advancing Regular Language Reasoning in Linear Recurrent Neural Networks [56.11830645258106]
We study whether linear recurrent neural networks (LRNNs) can learn the hidden rules in training sequences.
We propose a new LRNN equipped with a block-diagonal and input-dependent transition matrix.
Experiments suggest that the proposed model is the only LRNN capable of performing length extrapolation on regular language tasks.
arXiv Detail & Related papers (2023-09-14T03:36:01Z)
- Weighted Automata Extraction and Explanation of Recurrent Neural Networks for Natural Language Tasks [15.331024247043999]
Recurrent Neural Networks (RNNs) have achieved tremendous success in processing sequential data, yet understanding and analyzing their behaviours remains a significant challenge.
We propose a novel framework of Weighted Finite Automata (WFA) extraction and explanation to tackle the limitations for natural language tasks.
arXiv Detail & Related papers (2023-06-24T19:16:56Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Extracting Finite Automata from RNNs Using State Merging [1.8072051868187933]
We propose a new method for extracting finite automata from RNNs inspired by the state merging paradigm from grammatical inference.
We demonstrate the effectiveness of our method on the Tomita languages benchmark, where we find that it is able to extract faithful automata from RNNs trained on all languages in the benchmark.
arXiv Detail & Related papers (2022-01-28T23:03:25Z)
- Distributionally Robust Recurrent Decoders with Random Network Distillation [93.10261573696788]
We propose a method based on OOD detection with Random Network Distillation to allow an autoregressive language model to disregard OOD context during inference.
We apply our method to a GRU architecture, demonstrating improvements on multiple language modeling (LM) datasets.
arXiv Detail & Related papers (2021-10-25T19:26:29Z)
- Synthesizing Context-free Grammars from Recurrent Neural Networks (Extended Version) [6.3455238301221675]
We present an algorithm for extracting context-free grammars (CFGs) from a trained recurrent neural network (RNN).
We develop a new framework, pattern rule sets (PRSs), which describe sequences of deterministic finite automata (DFAs) that approximate a non-regular language.
We show how a PRS may be converted into a CFG, enabling a familiar and useful presentation of the learned language.
arXiv Detail & Related papers (2021-01-20T16:22:25Z)
- Reprogramming Language Models for Molecular Representation Learning [65.00999660425731]
We propose Representation Reprogramming via Dictionary Learning (R2DL) for adversarially reprogramming pretrained language models for molecular learning tasks.
The adversarial program learns a linear transformation between a dense source model input space (language data) and a sparse target model input space (e.g., chemical and biological molecule data) using a k-SVD solver.
R2DL achieves the baseline established by state-of-the-art toxicity prediction models trained on domain-specific data and outperforms the baseline in a limited training-data setting.
arXiv Detail & Related papers (2020-12-07T05:50:27Z)
- Distillation of Weighted Automata from Recurrent Neural Networks using a Spectral Approach [0.0]
This paper is an attempt to bridge the gap between deep learning and grammatical inference.
It provides an algorithm to extract a formal language from any recurrent neural network trained for language modelling.
arXiv Detail & Related papers (2020-09-28T07:04:15Z)
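For the spectral route in the final entry above, the generic recipe from spectral learning of weighted automata (query a string-weight oracle to fill Hankel blocks, then recover the WFA from a truncated SVD) looks roughly as follows. This is a textbook-style sketch, assuming the trained RNN is wrapped as an oracle f mapping a string to a weight; it is not taken from that paper.

```python
import numpy as np

def spectral_wfa(prefixes, suffixes, alphabet, f, rank):
    """Recover a rank-`rank` WFA from a string-weight oracle f (e.g., an RNN language model)."""
    # Hankel blocks over the chosen prefix/suffix bases (both should include the empty string).
    H  = np.array([[f(u + v) for v in suffixes] for u in prefixes])
    Ha = {a: np.array([[f(u + a + v) for v in suffixes] for u in prefixes])
          for a in alphabet}
    h_s = np.array([f(v) for v in suffixes])   # weights of suffixes alone
    h_p = np.array([f(u) for u in prefixes])   # weights of prefixes alone

    # Truncated SVD of H provides the low-rank factorization used for recovery.
    _, _, Vt = np.linalg.svd(H, full_matrices=False)
    V = Vt[:rank].T
    P = np.linalg.pinv(H @ V)

    A     = {a: P @ Ha[a] @ V for a in alphabet}   # per-symbol transition matrices
    alpha = h_s @ V                                # initial weight vector
    omega = P @ h_p                                # final weight vector
    return alpha, A, omega

def wfa_weight(alpha, A, omega, string):
    """Weight the recovered WFA assigns to a string."""
    v = alpha
    for a in string:
        v = v @ A[a]
    return float(v @ omega)
```

In use, one would pick small prefix and suffix bases, query the RNN for every Hankel cell, and check that the recovered weights match the RNN on held-out strings.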
This list is automatically generated from the titles and abstracts of the papers on this site.