Model Explainability in Deep Learning Based Natural Language Processing
- URL: http://arxiv.org/abs/2106.07410v1
- Date: Mon, 14 Jun 2021 13:23:20 GMT
- Title: Model Explainability in Deep Learning Based Natural Language Processing
- Authors: Shafie Gholizadeh and Nengfeng Zhou
- Abstract summary: We reviewed and compared some popular machine learning model explainability methodologies.
We applied one of the NLP explainability methods to an NLP classification model.
We identified some common issues due to the special nature of NLP models.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML) model explainability has received growing attention,
especially in areas related to model risk and regulation. In this paper, we
reviewed and compared some popular ML model explainability methodologies,
especially those related to Natural Language Processing (NLP) models. We then
applied one of these NLP explainability methods, Layer-wise Relevance Propagation
(LRP), to an NLP classification model. We used the LRP method to derive a
relevance score for each word in an instance, which serves as a local explanation.
The relevance scores are then aggregated to obtain the global variable
importance of the model. Through the case study, we also demonstrated how to
apply the local explainability method to false positive and false negative
instances to discover the weaknesses of an NLP model. These analyses can help us
understand NLP models better and reduce the risk arising from the black-box nature
of NLP models. We also identified some common issues due to the special nature
of NLP models and discussed how explainability analysis can act as a control to
detect these issues after the model has been trained.
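As a rough illustration of the workflow the abstract describes, below is a minimal sketch (not the paper's implementation) of LRP with the epsilon rule on a toy bag-of-words classifier: it derives a per-word relevance score for each instance (the local explanation) and then aggregates absolute relevance across instances into a global word-importance ranking. The vocabulary, weights, and documents are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["refund", "delay", "thanks", "charge", "error"]  # toy vocabulary (hypothetical)
V, H, C = len(vocab), 8, 2                                # vocab size, hidden units, classes

# Hypothetical weights of a 2-layer ReLU classifier over bag-of-words counts;
# in practice these would come from a trained model.
W1, b1 = rng.normal(size=(V, H)), np.zeros(H)
W2, b2 = rng.normal(size=(H, C)), np.zeros(C)

def forward(x):
    h = np.maximum(0.0, x @ W1 + b1)
    return h, h @ W2 + b2

def lrp_epsilon(a, W, b, relevance_out, eps=1e-6):
    """Propagate relevance one layer back using the LRP-epsilon rule."""
    z = a @ W + b                                  # pre-activations of the upper layer
    z = z + eps * np.where(z >= 0, 1.0, -1.0)      # stabilizer for small denominators
    s = relevance_out / z                          # relevance per unit of contribution
    return a * (s @ W.T)                           # redistribute relevance to the inputs

def word_relevance(x, target_class):
    """Local explanation: one relevance score per vocabulary word for instance x."""
    h, logits = forward(x)
    r_out = np.zeros_like(logits)
    r_out[target_class] = logits[target_class]     # start from the target-class logit
    r_hidden = lrp_epsilon(h, W2, b2, r_out)
    return lrp_epsilon(x, W1, b1, r_hidden)

# Global variable importance: aggregate absolute local relevance over a toy corpus.
docs = rng.integers(0, 3, size=(100, V)).astype(float)    # random bag-of-words counts
preds = [int(np.argmax(forward(x)[1])) for x in docs]
global_importance = np.mean(
    [np.abs(word_relevance(x, c)) for x, c in zip(docs, preds)], axis=0)

for word, score in sorted(zip(vocab, global_importance), key=lambda t: -t[1]):
    print(f"{word:10s} {score:.3f}")
```

With a trained NLP classifier in place of the random weights, the local-to-global aggregation step is the same: average the per-word relevance magnitudes over the corpus and rank the words.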
Related papers
- Zero-shot LLM-guided Counterfactual Generation: A Case Study on NLP Model Evaluation [15.254775341371364]
We explore the possibility of leveraging large language models for zero-shot counterfactual generation.
We propose a structured pipeline to facilitate this generation, and we hypothesize that the instruction-following and textual understanding capabilities of recent LLMs can be effectively leveraged.
arXiv Detail & Related papers (2024-05-08T03:57:45Z) - Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals [67.64770842323966]
Causal explanations of predictions of NLP systems are essential to ensure safety and establish trust.
Existing methods often fall short of explaining model predictions effectively or efficiently.
We propose two approaches for counterfactual (CF) approximation.
arXiv Detail & Related papers (2023-10-01T07:31:04Z) - Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z) - Large Language Models as Annotators: Enhancing Generalization of NLP Models at Minimal Cost [6.662800021628275]
We study the use of large language models (LLMs) for annotating inputs and improving the generalization of NLP models.
We propose a sampling strategy based on the difference in prediction scores between the base model and the finetuned NLP model.
arXiv Detail & Related papers (2023-06-27T19:29:55Z) - KNOW How to Make Up Your Mind! Adversarially Detecting and Alleviating Inconsistencies in Natural Language Explanations [52.33256203018764]
We leverage external knowledge bases to significantly improve on an existing adversarial attack for detecting inconsistent NLEs.
We show that models with higher NLE quality do not necessarily generate fewer inconsistencies.
arXiv Detail & Related papers (2023-06-05T15:51:58Z) - On the Explainability of Natural Language Processing Deep Models [3.0052400859458586]
Methods have been developed to address these challenges and provide satisfactory explanations of Natural Language Processing (NLP) models.
Motivated to democratize ExAI methods in the NLP field, we present in this work a survey that studies model-agnostic as well as model-specific explainability methods on NLP models.
arXiv Detail & Related papers (2022-10-13T11:59:39Z) - Towards Faithful Model Explanation in NLP: A Survey [48.690624266879155]
End-to-end neural Natural Language Processing (NLP) models are notoriously difficult to understand.
One desideratum of model explanation is faithfulness, i.e. an explanation should accurately represent the reasoning process behind the model's prediction.
We review over 110 model explanation methods in NLP through the lens of faithfulness.
arXiv Detail & Related papers (2022-09-22T21:40:51Z) - Interpreting Deep Learning Models in Natural Language Processing: A Review [33.80537635077772]
A long-standing criticism against neural network models is the lack of interpretability.
In this survey, we provide a comprehensive review of various interpretation methods for neural models in NLP.
arXiv Detail & Related papers (2021-10-20T10:17:04Z) - Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z) - Towards Interpretable Deep Learning Models for Knowledge Tracing [62.75876617721375]
We propose to adopt the post-hoc method to tackle the interpretability issue for deep learning based knowledge tracing (DLKT) models.
Specifically, we focus on applying the layer-wise relevance propagation (LRP) method to interpret RNN-based DLKT models.
Experiment results show the feasibility of using the LRP method for interpreting the DLKT model's predictions.
arXiv Detail & Related papers (2020-05-13T04:03:21Z) - Considering Likelihood in NLP Classification Explanations with Occlusion and Language Modeling [11.594541142399223]
Occlusion is a well-established method that provides explanations on discrete language data.
We argue that current Occlusion-based methods often produce invalid or syntactically incorrect language data.
We propose OLM: a novel explanation method that combines Occlusion and language models to sample valid and syntactically correct replacements (a plain-occlusion sketch follows this list).
arXiv Detail & Related papers (2020-04-21T10:37:44Z)
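For contrast with the last entry above, here is a minimal sketch of the plain occlusion baseline that OLM improves on: each word is replaced by a mask token and the token's importance is the resulting drop in the target-class probability. The `predict_proba` callable and the `[UNK]` mask token are hypothetical stand-ins for whatever classifier and masking convention are used.

```python
from typing import Callable, List

def occlusion_scores(tokens: List[str],
                     target_class: int,
                     predict_proba: Callable[[str], List[float]],
                     mask: str = "[UNK]") -> List[float]:
    """Score each token by how much masking it lowers the target-class probability."""
    base = predict_proba(" ".join(tokens))[target_class]
    scores = []
    for i in range(len(tokens)):
        occluded = tokens[:i] + [mask] + tokens[i + 1:]  # occlude one token at a time
        scores.append(base - predict_proba(" ".join(occluded))[target_class])
    return scores
```

OLM differs by replacing the fixed mask with replacements sampled from a language model, so the occluded inputs remain valid, syntactically correct sentences.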
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.