Interpreting Deep Learning Models in Natural Language Processing: A
Review
- URL: http://arxiv.org/abs/2110.10470v1
- Date: Wed, 20 Oct 2021 10:17:04 GMT
- Title: Interpreting Deep Learning Models in Natural Language Processing: A
Review
- Authors: Xiaofei Sun, Diyi Yang, Xiaoya Li, Tianwei Zhang, Yuxian Meng, Qiu
Han, Guoyin Wang, Eduard Hovy, Jiwei Li
- Abstract summary: A long-standing criticism against neural network models is the lack of interpretability.
In this survey, we provide a comprehensive review of various interpretation methods for neural models in NLP.
- Score: 33.80537635077772
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural network models have achieved state-of-the-art performances in a wide
range of natural language processing (NLP) tasks. However, a long-standing
criticism against neural network models is the lack of interpretability, which
not only reduces the reliability of neural NLP systems but also limits the
scope of their applications in areas where interpretability is essential (e.g.,
health care applications). In response, the increasing interest in interpreting
neural NLP models has spurred a diverse array of interpretation methods over
recent years. In this survey, we provide a comprehensive review of various
interpretation methods for neural models in NLP. We first lay out a
high-level taxonomy for interpretation methods in NLP, i.e., training-based
approaches, test-based approaches, and hybrid approaches. Next, we describe
sub-categories in each category in detail, e.g., influence-function based
methods, KNN-based methods, attention-based models, saliency-based methods,
perturbation-based methods, etc. We point out deficiencies of current methods
and suggest some avenues for future research.
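To make the taxonomy above concrete, the saliency-based category can be sketched with a minimal gradient-times-input example over a toy linear model. This is a hedged illustration only: the vocabulary, embeddings, and linear scoring head below are invented for the sketch and are not taken from the survey.

```python
import numpy as np

# Toy linear "sentiment model": all embeddings, weights, and vocabulary
# below are invented for illustration.
rng = np.random.default_rng(0)
vocab = ["the", "movie", "was", "great", "terrible"]
emb = {w: rng.normal(size=4) for w in vocab}
w_head = rng.normal(size=4)  # weights of a linear scoring head

def score(tokens):
    """Model output: sum of token embeddings passed through the linear head."""
    return sum(float(emb[t] @ w_head) for t in tokens)

def saliency(tokens):
    """Gradient-times-input saliency. For a linear head, the gradient of the
    score with respect to each embedding e_i is w_head, so each token's
    attribution is simply e_i . w_head."""
    return {t: float(emb[t] @ w_head) for t in tokens}

sentence = ["the", "movie", "was", "great"]
attributions = saliency(sentence)
```

For this linear model the per-token attributions sum exactly to the model score, which is the completeness property that more elaborate saliency methods (e.g., integrated gradients) approximate for nonlinear models.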
Related papers
- Recent advancements in computational morphology : A comprehensive survey [0.11606731918609076]
Computational morphology handles language processing at the word level.
It covers tasks such as morpheme boundary detection, lemmatization, morphological feature tagging, and morphological reinflection.
arXiv Detail & Related papers (2024-06-08T10:07:33Z)
- Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its capabilities to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z)
- Adversarial Attacks on the Interpretation of Neuron Activation Maximization [70.5472799454224]
Activation-maximization approaches are used to interpret and analyze trained deep-learning models.
In this work, we consider the concept of an adversary manipulating a model for the purpose of deceiving the interpretation.
arXiv Detail & Related papers (2023-06-12T19:54:33Z)
- Evaluating Neuron Interpretation Methods of NLP Models [28.71369775524347]
We propose an evaluation framework that measures the compatibility of a neuron analysis method with other methods.
We present a comparative analysis of a large set of neuron interpretation methods.
The framework enables the evaluation of any new method using 20 concepts across three pre-trained models.
arXiv Detail & Related papers (2023-01-30T02:04:35Z)
- Neural Additive Models for Location Scale and Shape: A Framework for Interpretable Neural Regression Beyond the Mean [1.0923877073891446]
Deep neural networks (DNNs) have proven to be highly effective in a variety of tasks.
Despite this success, the inner workings of DNNs are often not transparent.
This lack of interpretability has led to increased research on inherently interpretable neural networks.
arXiv Detail & Related papers (2023-01-27T17:06:13Z)
- On the Explainability of Natural Language Processing Deep Models [3.0052400859458586]
Methods have been developed to address these challenges and provide satisfactory explanations of Natural Language Processing (NLP) models.
Motivated to democratize ExAI methods in the NLP field, we present in this work a survey that studies model-agnostic as well as model-specific explainability methods on NLP models.
arXiv Detail & Related papers (2022-10-13T11:59:39Z)
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR).
Specifically, we inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework called the Feed-Forward Neural-Symbolic Learner (FF-NSL).
FF-NSL integrates state-of-the-art ILP systems based on the Answer Set semantics with neural networks to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- A Survey of Active Learning for Text Classification using Deep Neural Networks [1.2310316230437004]
Natural language processing (NLP) and neural networks (NNs) have both undergone significant changes in recent years.
For active learning (AL) purposes, however, NNs are less commonly used, despite their current popularity.
arXiv Detail & Related papers (2020-08-17T12:53:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.