Weighted Automata Extraction and Explanation of Recurrent Neural Networks for Natural Language Tasks
- URL: http://arxiv.org/abs/2306.14040v1
- Date: Sat, 24 Jun 2023 19:16:56 GMT
- Title: Weighted Automata Extraction and Explanation of Recurrent Neural Networks for Natural Language Tasks
- Authors: Zeming Wei, Xiyue Zhang, Yihao Zhang, Meng Sun
- Abstract summary: Recurrent Neural Networks (RNNs) have achieved tremendous success in processing sequential data, yet understanding and analyzing their behaviours remains a significant challenge.
We propose a novel framework of Weighted Finite Automata (WFA) extraction and explanation to tackle the limitations for natural language tasks.
- Score: 15.331024247043999
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent Neural Networks (RNNs) have achieved tremendous success in
processing sequential data, yet understanding and analyzing their behaviours
remains a significant challenge. To this end, many efforts have been made to
extract finite automata from RNNs, which are more amenable for analysis and
explanation. However, existing model extraction approaches, such as exact
learning and compositional methods, have limitations in either scalability or
precision. In this paper, we propose a novel framework of Weighted Finite
Automata (WFA) extraction and explanation to tackle the limitations for natural
language tasks. First, to address the transition sparsity and context loss
problems we identified in WFA extraction for natural language tasks, we propose
an empirical method to complement missing rules in the transition diagram, and
adjust transition matrices to enhance the context-awareness of the WFA. We also
propose two data augmentation tactics to track more dynamic behaviours of RNNs,
which further allows us to improve the extraction precision. Based on the
extracted model, we propose an explanation method for RNNs including a word
embedding method -- Transition Matrix Embeddings (TME) -- and TME-based
task-oriented explanation for the target RNN. Our evaluation demonstrates the
advantage of our method in extraction precision over existing approaches, and
the effectiveness of the TME-based explanation method in applications to
pretraining and adversarial example generation.
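To make the two key ingredients above concrete, here is a minimal Python sketch of a WFA in which unseen words receive a default transition rule (one plausible reading of "complementing missing rules") and each word's Transition Matrix Embedding is read off as its flattened transition matrix. All names and the self-loop fallback are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class WFA:
    """Minimal weighted finite automaton: the weight of a sentence
    w1...wk is alpha @ A[w1] @ ... @ A[wk] @ beta."""

    def __init__(self, n_states, alpha, beta):
        self.n = n_states
        self.alpha = alpha            # initial weight vector, shape (n,)
        self.beta = beta              # final weight vector, shape (n,)
        self.A = {}                   # word -> transition matrix, shape (n, n)

    def rule(self, word):
        # Hypothetical stand-in for the paper's empirical completion:
        # an unseen word gets a benign self-loop transition instead of
        # aborting the run (the "transition sparsity" problem).
        if word not in self.A:
            self.A[word] = np.eye(self.n)
        return self.A[word]

    def weight(self, sentence):
        v = self.alpha
        for word in sentence:
            v = v @ self.rule(word)
        return float(v @ self.beta)

    def tme(self, word):
        # Transition Matrix Embedding: a word is represented by its
        # (flattened) transition matrix in the extracted WFA.
        return self.rule(word).ravel()
```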
Related papers
- Directed Exploration in Reinforcement Learning from Linear Temporal Logic [59.707408697394534]
Linear temporal logic (LTL) is a powerful language for task specification in reinforcement learning.
We show that the synthesized reward signal remains fundamentally sparse, making exploration challenging.
We show how better exploration can be achieved by further leveraging the specification and casting its corresponding Limit Deterministic Büchi Automaton (LDBA) as a Markov reward process.
arXiv Detail & Related papers (2024-08-18T14:25:44Z) - DeepDFA: Automata Learning through Neural Probabilistic Relaxations [2.3326951882644553]
We introduce DeepDFA, a novel approach to identifying Deterministic Finite Automata (DFAs) from traces.
Inspired by both the probabilistic relaxation of DFAs and Recurrent Neural Networks (RNNs), our model offers interpretability post-training, alongside reduced complexity and enhanced training efficiency.
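As a sketch of what such a probabilistic relaxation can look like (an assumed form, not the authors' exact model), each symbol's transition matrix can be made row-stochastic via a softmax over learnable logits, so that acceptance is differentiable and the trained model collapses back into a discrete DFA:

```python
import torch
import torch.nn as nn

class RelaxedDFA(nn.Module):
    """Softmax-relaxed DFA: runs are differentiable, and an argmax over
    the learned logits recovers a discrete transition table."""

    def __init__(self, n_states, n_symbols):
        super().__init__()
        self.logits = nn.Parameter(torch.randn(n_symbols, n_states, n_states))
        self.accept = nn.Parameter(torch.randn(n_states))

    def forward(self, trace):
        # trace: 1-D LongTensor of symbol ids; start in state 0.
        state = torch.zeros(self.logits.shape[1])
        state[0] = 1.0
        for sym in trace:
            state = state @ torch.softmax(self.logits[sym], dim=-1)
        return torch.sigmoid(state @ self.accept)   # acceptance probability

    def discretize(self):
        # Collapse the relaxation: (symbol, state) -> most likely next state.
        return self.logits.argmax(dim=-1)
```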
arXiv Detail & Related papers (2024-08-16T09:30:36Z) - Deep Neural Network for Constraint Acquisition through Tailored Loss Function [0.0]
The significance of learning constraints from data is underscored by its potential applications in real-world problem-solving.
This work introduces a novel approach grounded in a Deep Neural Network (DNN) based on Symbolic Regression.
arXiv Detail & Related papers (2024-03-04T13:47:33Z) - Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by meta-learning the score function of the data-generating process.
arXiv Detail & Related papers (2022-10-24T15:14:26Z) - Extracting Weighted Finite Automata from Recurrent Neural Networks for Natural Languages [9.249443355045967]
Recurrent Neural Networks (RNNs) have achieved tremendous success in sequential data processing.
It is quite challenging to interpret and verify RNNs' behaviors directly.
We propose a transition rule extraction approach, which is scalable to natural language processing models.
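A common recipe for this kind of rule extraction, sketched below under our own assumptions (the paper's exact procedure may differ), is to abstract RNN hidden states into finitely many clusters and estimate one row-stochastic transition matrix per word from the observed cluster-to-cluster moves:

```python
import numpy as np
from sklearn.cluster import KMeans

def extract_rules(hidden_traces, n_states=20):
    """hidden_traces: list of (words, H) pairs, where row H[t] is the
    RNN hidden state observed after reading words[t]."""
    km = KMeans(n_clusters=n_states, n_init=10)
    km.fit(np.vstack([H for _, H in hidden_traces]))

    counts = {}  # word -> (n_states, n_states) transition counts
    for words, H in hidden_traces:
        prev = 0  # assume abstract state 0 is the initial state
        for word, s in zip(words, km.predict(H)):
            C = counts.setdefault(word, np.zeros((n_states, n_states)))
            C[prev, s] += 1
            prev = s

    rules = {}
    for word, C in counts.items():
        rows = C.sum(axis=1, keepdims=True)
        # Normalise observed rows; leave never-observed rows as self-loops.
        rules[word] = np.where(rows > 0, C / np.maximum(rows, 1), np.eye(n_states))
    return km, rules
```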
arXiv Detail & Related papers (2022-06-27T09:30:13Z) - Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model.
A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations.
We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
arXiv Detail & Related papers (2022-06-08T04:09:13Z) - Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
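For reference, the elementary step behind the interval bound propagation baselines mentioned here is easy to state (a generic sketch, not this paper's reachability analysis): an elementwise input interval is pushed through an affine layer by splitting the weights by sign.

```python
import numpy as np

def interval_affine(W, b, lo, hi):
    """Propagate the box [lo, hi] through x -> W @ x + b: positive
    weights pull from the matching bound, negative from the opposite."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    new_lo = W_pos @ lo + W_neg @ hi + b
    new_hi = W_pos @ hi + W_neg @ lo + b
    return new_lo, new_hi
```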
arXiv Detail & Related papers (2022-04-01T03:31:27Z) - Distillation of Weighted Automata from Recurrent Neural Networks using a Spectral Approach [0.0]
This paper is an attempt to bridge the gap between deep learning and grammatical inference.
It provides an algorithm to extract a formal language from any recurrent neural network trained for language modelling.
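The classic spectral recipe behind such distillation (a generic sketch of Hankel-matrix spectral learning, not necessarily this paper's exact variant) recovers WFA parameters from a truncated SVD of the Hankel matrix:

```python
import numpy as np

def spectral_wfa(H, H_sym, h_pre, h_suf, rank):
    """H[p, s] = f(prefix_p + suffix_s); H_sym maps each symbol a to the
    shifted block with entries f(prefix_p + a + suffix_s); h_pre is H's
    column at the empty suffix, h_suf its row at the empty prefix."""
    _, _, Vt = np.linalg.svd(H, full_matrices=False)
    V = Vt[:rank].T                     # rank-truncated right factor
    P_pinv = np.linalg.pinv(H @ V)      # pseudo-inverse of the prefix factor

    alpha = h_suf @ V                   # initial weight vector
    beta = P_pinv @ h_pre               # final weight vector
    A = {a: P_pinv @ Ha @ V for a, Ha in H_sym.items()}
    return alpha, A, beta               # f(w) ~ alpha @ A[w1] @ ... @ beta
```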
arXiv Detail & Related papers (2020-09-28T07:04:15Z) - Verifiable RNN-Based Policies for POMDPs Under Temporal Logic Constraints [31.829932777445894]
A major drawback in the application of RNN-based policies is the difficulty in providing formal guarantees on the satisfaction of behavioral specifications.
By integrating techniques from formal methods and machine learning, we propose an approach to automatically extract a finite-state controller from an RNN.
arXiv Detail & Related papers (2020-02-13T16:38:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.