Building Interpretable Interaction Trees for Deep NLP Models
- URL: http://arxiv.org/abs/2007.04298v2
- Date: Sat, 16 Jan 2021 15:07:07 GMT
- Title: Building Interpretable Interaction Trees for Deep NLP Models
- Authors: Die Zhang, Huilin Zhou, Hao Zhang, Xiaoyi Bao, Da Huo, Ruizhao Chen,
Xu Cheng, Mengyue Wu, Quanshi Zhang
- Abstract summary: Six metrics are proposed to analyze properties of interactions between constituents in a sentence.
Our method is used to quantify word interactions encoded inside the BERT, ELMo, LSTM, CNN, and Transformer networks.
- Score: 38.50154540331266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a method to disentangle and quantify interactions among
words that are encoded inside a DNN for natural language processing. We
construct a tree to encode salient interactions extracted by the DNN. Six
metrics are proposed to analyze properties of interactions between constituents
in a sentence. The interaction is defined based on Shapley values of words,
which are considered an unbiased estimate of each word's contribution to the
network prediction. Our method is used to quantify word interactions encoded
inside the BERT, ELMo, LSTM, CNN, and Transformer networks. Experimental
results provide a new perspective for understanding these DNNs and demonstrate
the effectiveness of our method.
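The word contributions the abstract refers to can be illustrated with a toy computation. The sketch below is not the paper's implementation: the score function `v` is a hypothetical stand-in for a network's prediction on a subset of words, and the exact enumeration is only feasible for a handful of words.

```python
from itertools import combinations
from math import factorial

def shapley_values(n, v):
    """Exact Shapley value of each of n players (words) under score v(subset).
    Enumerates all subsets, so this is only feasible for small n."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                # Fraction of orderings that place word i immediately after S.
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (v(frozenset(S) | {i}) - v(frozenset(S)))
    return phi

# Hypothetical score: each word adds 0.1, and words 0 and 1 add a joint bonus.
def v(S):
    return 0.1 * len(S) + (1.0 if {0, 1} <= S else 0.0)

phi = shapley_values(3, v)
# Efficiency property: the values sum to v(all words) - v(empty set).
```

In this toy game words 0 and 1 split the joint bonus evenly (0.6 each) while word 2 keeps only its individual 0.1, which is the kind of unbiased per-word attribution the paper builds its interactions on.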
Related papers
- Defining and Extracting generalizable interaction primitives from DNNs [22.79131582164054]
We develop a new method to extract interactions that are shared by different deep neural networks (DNNs).
Experiments show that the extracted interactions can better reflect common knowledge shared by different DNNs.
arXiv Detail & Related papers (2024-01-29T17:21:41Z)
- Explaining Interactions Between Text Spans [50.70253702800355]
Reasoning over spans of tokens from different parts of the input is essential for natural language understanding.
We introduce SpanEx, a dataset of human span interaction explanations for two NLU tasks: NLI and FC.
We then investigate the decision-making processes of multiple fine-tuned large language models in terms of the employed connections between spans.
arXiv Detail & Related papers (2023-10-20T13:52:37Z)
- Recognizing semantic relation in sentence pairs using Tree-RNNs and Typed dependencies [0.0]
This work proposes an improvement to Dependency Tree-RNN (DT-RNN) using the grammatical relationship type identified in the dependency parse.
Experiments on semantic relatedness scoring (SRS) and recognizing textual entailment (RTE) in sentence pairs using the SICK dataset show encouraging results.
arXiv Detail & Related papers (2022-01-13T06:59:27Z)
- Explaining Neural Network Predictions on Sentence Pairs via Learning Word-Group Masks [21.16662651409811]
We propose the Group Mask (GMASK) method to implicitly detect word correlations by grouping correlated words from the input text pair together.
The proposed method is evaluated with two different model architectures (decomposable attention model and BERT) across four datasets.
arXiv Detail & Related papers (2021-04-09T17:14:34Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
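The retrieval step behind that approach can be sketched as a plain Euclidean kNN over a model's hidden representations. Everything below is invented for illustration (the representations, the distance metric, and the tiny dataset), not the paper's actual setup.

```python
import numpy as np

def knn_explanations(train_reprs, train_labels, query_repr, k=3):
    """Return indices and labels of the k training examples whose hidden
    representations are nearest (Euclidean) to the query's representation."""
    dists = np.linalg.norm(train_reprs - query_repr, axis=1)
    idx = np.argsort(dists)[:k]
    return idx, [train_labels[i] for i in idx]

# Invented 2-D "hidden representations" forming two clusters.
train = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 4.9]])
labels = ["neg", "neg", "pos", "pos"]
idx, lab = knn_explanations(train, labels, np.array([4.9, 5.0]), k=2)
```

The retrieved neighbors (here the two "pos" cluster members) serve as candidate explanations: training examples the model's representation space treats as most similar to the query.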
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
- Interpreting Multivariate Shapley Interactions in DNNs [33.67263820904767]
This paper aims to explain deep neural networks (DNNs) from the perspective of multivariate interactions.
In this paper, we define and quantify the significance of interactions among multiple input variables of the DNN.
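One standard way to quantify such a pairwise interaction is the Shapley interaction index. The sketch below uses that textbook definition with a toy score function; the cited papers' exact formulations may differ in detail.

```python
from itertools import combinations
from math import factorial

def shapley_interaction(i, j, players, v):
    """Shapley interaction index for the pair (i, j): a weighted average of
    v(S+i+j) - v(S+i) - v(S+j) + v(S) over contexts S (small games only)."""
    n = len(players)
    rest = [p for p in players if p not in (i, j)]
    total = 0.0
    for k in range(len(rest) + 1):
        for S in combinations(rest, k):
            w = factorial(len(S)) * factorial(n - len(S) - 2) / factorial(n - 1)
            S = frozenset(S)
            total += w * (v(S | {i, j}) - v(S | {i}) - v(S | {j}) + v(S))
    return total

# Toy AND-style game: "a" and "b" only produce a payoff together.
v = lambda S: 1.0 if {"a", "b"} <= S else 0.0
b_ab = shapley_interaction("a", "b", ["a", "b", "c"], v)
```

In this game the pair ("a", "b") receives the full interaction credit of 1.0, while an unrelated pair such as ("a", "c") scores 0, matching the intuition that only jointly useful variables interact.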
arXiv Detail & Related papers (2020-10-10T17:02:51Z)
- Combining word embeddings and convolutional neural networks to detect duplicated questions [0.0]
We propose a simple approach to identifying semantically similar questions by combining the strengths of word embeddings and Convolutional Neural Networks.
Our network is trained on the Quora dataset, which contains over 400k question pairs.
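A minimal sketch of that combination follows. All of it is invented for illustration (untrained random embeddings and filters, a tiny vocabulary, and a cosine-similarity readout); it shows the embed-convolve-pool-compare pipeline, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"how": 0, "to": 1, "learn": 2, "python": 3, "study": 4}
emb = np.abs(rng.normal(size=(len(vocab), 8)))   # hypothetical word embeddings
filters = np.abs(rng.normal(size=(4, 3, 8)))     # 4 conv filters, 3-word windows

def encode(tokens):
    """Embed tokens, slide each filter over 3-word windows, max-pool over time."""
    x = emb[[vocab[t] for t in tokens]]                       # (T, 8)
    T, w = x.shape[0], filters.shape[1]
    conv = np.array([[np.sum(f * x[t:t + w]) for t in range(T - w + 1)]
                     for f in filters])                       # (4, T-w+1)
    return conv.max(axis=1)                                   # (4,)

def similarity(q1, q2):
    """Cosine similarity between the two questions' CNN encodings."""
    a, b = encode(q1), encode(q2)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

s = similarity(["how", "to", "learn", "python"],
               ["how", "to", "study", "python"])
```

A trained version would learn the embeddings and filters from labeled pairs and threshold the similarity to decide whether two questions are duplicates.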
arXiv Detail & Related papers (2020-06-08T12:30:25Z)
- Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach [78.77265671634454]
We make use of a multi-task objective, i.e., the models simultaneously predict words as well as ground-truth parse trees in a form called "syntactic distances".
Experimental results on the Penn Treebank and Chinese Treebank datasets show that when ground truth parse trees are provided as additional training signals, the model is able to achieve lower perplexity and induce trees with better quality.
arXiv Detail & Related papers (2020-05-12T15:35:00Z)
- SkipGNN: Predicting Molecular Interactions with Skip-Graph Networks [70.64925872964416]
We present SkipGNN, a graph neural network approach for the prediction of molecular interactions.
SkipGNN predicts molecular interactions by not only aggregating information from direct interactions but also from second-order interactions.
We show that SkipGNN achieves superior and robust performance, outperforming existing methods by up to 28.8% of area under the precision-recall curve.
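The second-order idea can be sketched as aggregating over both the original adjacency and a "skip" adjacency that links nodes sharing a common neighbor. This mean-aggregation toy, with its tiny graph and all names invented here, illustrates only the skip-graph construction, not SkipGNN's actual layers.

```python
import numpy as np

def skip_aggregate(A, H):
    """One round of mean aggregation of node features H over both the original
    graph A and its skip graph A2, which links nodes iff they share a neighbor."""
    A2 = ((A @ A) > 0).astype(float)   # two-hop reachability
    np.fill_diagonal(A2, 0.0)          # a node is not its own skip-neighbor
    deg1 = np.maximum(A.sum(1, keepdims=True), 1.0)
    deg2 = np.maximum(A2.sum(1, keepdims=True), 1.0)
    return (A @ H) / deg1 + (A2 @ H) / deg2

# Path graph 0-1-2: nodes 0 and 2 share neighbor 1, so the skip graph links them.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.eye(3)  # one-hot node features make the aggregation sources visible
out = skip_aggregate(A, H)
```

With one-hot features, row 0 of `out` picks up node 2's feature through the skip edge even though 0 and 2 are not directly connected, which is the second-order signal SkipGNN exploits for interaction prediction.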
arXiv Detail & Related papers (2020-04-30T16:55:58Z)
- Linguistically Driven Graph Capsule Network for Visual Question Reasoning [153.76012414126643]
We propose a hierarchical compositional reasoning model called the "Linguistically driven Graph Capsule Network".
The compositional process is guided by the linguistic parse tree. Specifically, we bind each capsule in the lowest layer to bridge the linguistic embedding of a single word in the original question with visual evidence.
Experiments on the CLEVR dataset, CLEVR compositional generation test, and FigureQA dataset demonstrate the effectiveness and composition generalization ability of our end-to-end model.
arXiv Detail & Related papers (2020-03-23T03:34:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.