Tell Model Where to Attend: Improving Interpretability of Aspect-Based
Sentiment Classification via Small Explanation Annotations
- URL: http://arxiv.org/abs/2302.10479v1
- Date: Tue, 21 Feb 2023 06:55:08 GMT
- Title: Tell Model Where to Attend: Improving Interpretability of Aspect-Based
Sentiment Classification via Small Explanation Annotations
- Authors: Zhenxiao Cheng, Jie Zhou, Wen Wu, Qin Chen, Liang He
- Abstract summary: We propose an Interpretation-Enhanced Gradient-based framework for ABSC via a small number of explanation annotations, namely IEGA.
Our framework is model-agnostic and task-agnostic, so it can be integrated into existing ABSC methods or applied to other tasks.
- Score: 23.05672636220897
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gradient-based explanation methods play an important role in the field of
interpreting complex deep neural networks in NLP. However, existing
work has shown that a model's gradients are unstable and easily
manipulated, which largely undermines their reliability. According to our
preliminary analyses, we also find that the interpretability of gradient-based
methods is limited for complex tasks, such as aspect-based sentiment
classification (ABSC). In this paper, we propose an
\textbf{I}nterpretation-\textbf{E}nhanced \textbf{G}radient-based framework for
\textbf{A}BSC via a small number of explanation annotations, namely
\texttt{IEGA}. Specifically, we first calculate the word-level saliency map
based on gradients to measure the importance of the words in the sentence
towards the given aspect. Then, we design a gradient correction module to
enhance the model's attention on the correct parts (e.g., opinion words). Our
framework is model-agnostic and task-agnostic, so it can be integrated into
existing ABSC methods or applied to other tasks. Comprehensive experimental
results on four benchmark datasets show that our \texttt{IEGA} improves not
only the interpretability of the model but also its performance and robustness.
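
As a rough illustration of the two steps the abstract describes, here is a minimal PyTorch sketch: a gradient-based word-level saliency map, followed by a correction loss that pulls saliency toward annotated opinion words. The encoder interface (`model` mapping embeddings to class logits), the KL-based loss form, and the `opinion_mask` annotation format are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def word_saliency(model, embeddings, label_idx):
    """Gradient-x-input saliency per token for one class logit.

    model: assumed callable mapping (batch, seq, d) embeddings to
    (batch, n_classes) logits; embeddings: token embeddings of the sentence.
    """
    embeddings = embeddings.detach().requires_grad_(True)
    score = model(embeddings)[:, label_idx].sum()
    # create_graph=True keeps the saliency differentiable, so a loss on it
    # can still update the model's parameters.
    grads, = torch.autograd.grad(score, embeddings, create_graph=True)
    return (grads * embeddings).sum(dim=-1)  # (batch, seq_len)

def gradient_correction_loss(saliency, opinion_mask):
    """Pull the saliency distribution toward annotated opinion words.

    opinion_mask: (batch, seq_len), 1.0 at annotated opinion tokens.
    """
    log_p = F.log_softmax(saliency, dim=-1)
    q = opinion_mask / opinion_mask.sum(dim=-1, keepdim=True).clamp(min=1e-8)
    return F.kl_div(log_p, q, reduction="batchmean")
```

In training, one would add this term to the ordinary classification loss, applying it only on the small annotated subset the abstract mentions.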
Related papers
- A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap [50.079224604394]
We present a novel model-agnostic framework called Context-Enhanced Feature Alignment (CEFA).
CEFA consists of a feature alignment module and a context enhancement module.
Our method can serve as a plug-and-play module to improve the detection performance of HOI models on rare categories.
arXiv Detail & Related papers (2024-07-31T08:42:48Z) - Structure Your Data: Towards Semantic Graph Counterfactuals [1.8817715864806608]
Counterfactual explanations (CEs) based on concepts are explanations that consider alternative scenarios to understand which high-level semantic features contributed to model predictions.
In this work, we propose CEs based on the semantic graphs accompanying input data to achieve more descriptive, accurate, and human-aligned explanations.
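A hedged sketch of one way to realize such a retrieval-style counterfactual: among examples the model labels differently, return the one whose semantic graph is closest to the query graph under graph edit distance. networkx is used here as a stand-in; the paper's actual similarity machinery may differ.

```python
import networkx as nx

def nearest_graph_counterfactual(query_graph, query_label, candidates):
    """candidates: list of (semantic_graph, model_label) pairs."""
    best, best_cost = None, float("inf")
    for graph, label in candidates:
        if label == query_label:
            continue  # a counterfactual must change the prediction
        # timeout bounds the (expensive) exact edit-distance search
        cost = nx.graph_edit_distance(query_graph, graph, timeout=5)
        if cost is not None and cost < best_cost:
            best, best_cost = graph, cost
    return best, best_cost
```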
arXiv Detail & Related papers (2024-03-11T08:40:37Z) - Learning Intrinsic Dimension via Information Bottleneck for Explainable
Aspect-based Sentiment Analysis [30.16902652669842]
We propose an Information Bottleneck-based Gradient (IBG) explanation framework for Aspect-based Sentiment Analysis (ABSA).
Our framework refines word embeddings into a concise intrinsic dimension, maintaining essential features and omitting unrelated information.
It considerably improves both the models' performance and interpretability by identifying sentiment-aware features.
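A minimal sketch of the compression step this summary describes, written as a variational information-bottleneck layer over word embeddings; the latent width and architecture are illustrative assumptions rather than the paper's settings.

```python
import torch
import torch.nn as nn

class BottleneckLayer(nn.Module):
    """Compress d_model-dim embeddings into a concise d_latent-dim code."""

    def __init__(self, d_model=768, d_latent=32):
        super().__init__()
        self.mu = nn.Linear(d_model, d_latent)
        self.logvar = nn.Linear(d_model, d_latent)

    def forward(self, h):
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        # KL(q(z|h) || N(0, I)) pressures the code to drop unrelated information.
        kl = 0.5 * (logvar.exp() + mu.pow(2) - 1.0 - logvar).sum(-1).mean()
        return z, kl
```

The classifier then consumes `z` instead of the raw embedding, trading a small KL penalty for a low-dimensional, more interpretable representation.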
arXiv Detail & Related papers (2024-02-28T08:11:05Z) - Generalizing Backpropagation for Gradient-Based Interpretability [103.2998254573497]
We show that the gradient of a model is a special case of a more general formulation using semirings.
This observation allows us to generalize the backpropagation algorithm to efficiently compute other interpretable statistics.
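The idea is concrete enough to show on a toy computation graph: ordinary backpropagation accumulates path weights in the (sum, product) semiring, and swapping in (max, product) recovers the single most influential path instead. The graph encoding below is an illustrative assumption.

```python
from math import inf

# child -> [(parent, local derivative)], listed in topological order from the output
edges = {"y": [("h1", 2.0), ("h2", 0.5)], "h1": [("x", 3.0)], "h2": [("x", 4.0)]}
order = ["y", "h1", "h2", "x"]

def backprop(add, mul, zero, one):
    acc = {node: zero for node in order}
    acc["y"] = one  # seed the output node
    for node in order:
        for parent, d in edges.get(node, []):
            acc[parent] = add(acc[parent], mul(acc[node], d))
    return acc["x"]

grad = backprop(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)  # 2*3 + 0.5*4 = 8.0
best_path = backprop(max, lambda a, b: a * b, -inf, 1.0)           # max(6.0, 2.0) = 6.0
```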
arXiv Detail & Related papers (2023-07-06T15:19:53Z) - A Unified Understanding of Deep NLP Models for Text Classification [88.35418976241057]
We have developed a visual analysis tool, DeepNLPVis, to enable a unified understanding of NLP models for text classification.
The key idea is a mutual information-based measure, which provides quantitative explanations on how each layer of a model maintains the information of input words in a sample.
A multi-level visualization, which consists of a corpus-level, a sample-level, and a word-level visualization, supports the analysis from the overall training set to individual samples.
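As a loose illustration of measuring how much word information a layer maintains: discretize the layer's token representations (k-means here, an illustrative stand-in for the paper's estimator) and score mutual information against the input word identities.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import mutual_info_score

def layer_word_mi(layer_reps, word_ids, n_clusters=32):
    """layer_reps: (n_tokens, d) activations; word_ids: (n_tokens,) input words."""
    clusters = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(layer_reps)
    # Higher MI = this layer still separates tokens by their input word.
    return mutual_info_score(word_ids, clusters)
```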
arXiv Detail & Related papers (2022-06-19T08:55:07Z) - Locally Aggregated Feature Attribution on Natural Language Model
Understanding [12.233103741197334]
Locally Aggregated Feature Attribution (LAFA) is a novel gradient-based feature attribution method for NLP models.
Instead of relying on obscure reference tokens, it smooths gradients by aggregating similar reference texts derived from language model embeddings.
For evaluation purposes, we also design experiments on different NLP tasks, including Entity Recognition and Sentiment Analysis, on public datasets.
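A hedged sketch of the aggregation step: per reference text, take a one-point integrated-gradients-style attribution (gradient at the midpoint times the input-reference difference), then average over the retrieved neighbors. The model interface and the single-point approximation are illustrative simplifications.

```python
import torch

def lafa_attribution(model, input_emb, reference_embs, target_idx):
    """input_emb: (seq_len, d); reference_embs: embeddings of nearby reference texts.

    model: assumed callable mapping (1, seq_len, d) to (1, n_classes) logits.
    """
    input_emb = input_emb.detach()
    scores = torch.zeros(input_emb.shape[0])
    for ref in reference_embs:
        x = (0.5 * (input_emb + ref)).detach().requires_grad_(True)
        logit = model(x.unsqueeze(0))[0, target_idx]
        grad, = torch.autograd.grad(logit, x)
        scores += (grad * (input_emb - ref)).sum(-1)  # attribution vs. this reference
    return scores / len(reference_embs)               # smoothed by aggregation
```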
arXiv Detail & Related papers (2022-04-22T18:59:27Z) - Convex Polytope Modelling for Unsupervised Derivation of Semantic
Structure for Data-efficient Natural Language Understanding [31.888489552069146]
A Convex-Polytopic-Model-based framework shows great potential in automatically extracting semantic patterns by exploiting the raw dialog corpus.
We show that this framework can exploit semantic-frame-related features in the corpus, reveal the underlying semantic structure of the utterances, and boost the performance of the state-of-the-art NLU model with minimal supervision.
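Convex-polytopic modelling is closely related to non-negative matrix factorization, so a hedged stand-in sketch: factor a bag-of-words dialog corpus so that each utterance is approximately a convex mixture of a few learned "vertex" patterns.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import CountVectorizer

utterances = [
    "book a flight to boston",
    "show me flights from denver",
    "what meals are served",
    "is a meal offered on this flight",
]
X = CountVectorizer().fit_transform(utterances)
nmf = NMF(n_components=2, init="nndsvda", random_state=0)
W = nmf.fit_transform(X)  # utterance-by-pattern loadings
H = nmf.components_       # pattern-by-word weights: candidate semantic frames
```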
arXiv Detail & Related papers (2022-01-25T19:12:44Z) - Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
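A small sketch of how distributional node embeddings plug into a standard contrastive objective: each view's encoder emits a sample (not a fixed vector) per node, and InfoNCE treats the same node across two augmented views as the positive pair. The temperature and normalization choices are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.5):
    """z1, z2: (n_nodes, d) sampled node embeddings under two augmentations."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature  # pairwise similarities across views
    labels = torch.arange(z1.size(0))   # node i in view 1 matches node i in view 2
    return F.cross_entropy(logits, labels)
```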
arXiv Detail & Related papers (2021-12-15T01:45:32Z) - Improving Aspect-based Sentiment Analysis with Gated Graph Convolutional
Networks and Syntax-based Regulation [89.38054401427173]
Aspect-based Sentiment Analysis (ABSA) seeks to predict the sentiment polarity of a sentence toward a specific aspect.
Dependency trees can be integrated into deep learning models to achieve state-of-the-art performance for ABSA.
We propose a novel graph-based deep learning model to overcome the limitations of this integration.
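A toy sketch of one aspect-gated graph-convolution layer over a dependency tree, in the spirit of (but not reproducing) the model this summary describes; the gating form is an illustrative assumption.

```python
import torch
import torch.nn as nn

class AspectGatedGCNLayer(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.w = nn.Linear(d, d)
        self.gate = nn.Linear(2 * d, d)

    def forward(self, h, adj, aspect):
        # h: (n, d) token states; adj: (n, n) dependency adjacency; aspect: (d,)
        deg = adj.sum(-1, keepdim=True).clamp(min=1)
        msg = torch.relu(self.w(adj @ h) / deg)  # average over tree neighbors
        a = aspect.expand_as(msg)
        g = torch.sigmoid(self.gate(torch.cat([msg, a], dim=-1)))
        return g * msg + (1 - g) * h             # keep what the aspect cares about
```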
arXiv Detail & Related papers (2020-10-26T07:36:24Z) - Explaining and Improving Model Behavior with k Nearest Neighbor
Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
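The retrieval step is simple to sketch: cache the finetuned model's representation of every training example, then explain a test prediction by its nearest training neighbors in that space. The encoder call producing these representations is assumed to exist upstream.

```python
from sklearn.neighbors import NearestNeighbors

def fit_knn(train_reps):
    """train_reps: (n_train, d) array of finetuned-model representations."""
    return NearestNeighbors(n_neighbors=5, metric="cosine").fit(train_reps)

def explain(knn, test_rep, train_texts):
    dist, idx = knn.kneighbors(test_rep.reshape(1, -1))
    # Clusters of spuriously similar neighbors are a signal of learned shortcuts.
    return [(train_texts[i], d) for i, d in zip(idx[0], dist[0])]
```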
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.