Dynamic Top-k Estimation Consolidates Disagreement between Feature
Attribution Methods
- URL: http://arxiv.org/abs/2310.05619v2
- Date: Fri, 3 Nov 2023 12:11:17 GMT
- Title: Dynamic Top-k Estimation Consolidates Disagreement between Feature
Attribution Methods
- Authors: Jonathan Kamp, Lisa Beinborn, Antske Fokkens
- Abstract summary: We find that perturbation-based methods and Vanilla Gradient exhibit the highest agreement on most method--method and method--human agreement metrics with a static k.
This is the first evidence that sequential properties of attribution scores are informative for consolidating attribution signals for human interpretation.
- Score: 5.202524136984542
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Feature attribution scores are used for explaining the prediction of a text
classifier to users by highlighting the top k tokens. In this work, we
propose a way to determine the optimal number k of tokens that should be
displayed, based on sequential properties of the attribution scores. Our approach is
dynamic across sentences, method-agnostic, and deals with sentence length bias.
We compare agreement between multiple methods and humans on an NLI task, using
fixed k and dynamic k. We find that perturbation-based methods and Vanilla
Gradient exhibit the highest agreement on most method--method and method--human
agreement metrics with a static k. Their advantage over other methods
disappears with dynamic k, which mainly improves Integrated Gradients and
GradientXInput. To our knowledge, this is the first evidence that sequential
properties of attribution scores are informative for consolidating attribution
signals for human interpretation.
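The abstract does not spell out how k is estimated from the sequence of attribution scores, so the following is only a minimal sketch of one plausible heuristic: choosing k at the largest drop between consecutive sorted scores (an "elbow" rule). The function name and the specific criterion are illustrative assumptions, not the paper's actual estimator.

```python
import numpy as np

def dynamic_top_k(scores):
    """Pick k at the largest drop between consecutive sorted scores.

    This elbow-style heuristic only illustrates the idea of deriving k
    from sequential properties of attribution scores; the paper's actual
    estimator may differ.
    """
    s = np.sort(np.asarray(scores, dtype=float))[::-1]  # descending order
    if s.size < 2:
        return int(s.size)
    gaps = s[:-1] - s[1:]             # drop after each rank
    return int(np.argmax(gaps)) + 1   # k = rank just before the largest drop

# Two clearly dominant tokens, then a sharp drop: the heuristic selects k = 2.
k = dynamic_top_k([0.9, 0.85, 0.2, 0.15, 0.1])
```

Because k adapts to the shape of each sentence's score distribution, it naturally varies across sentences, matching the "dynamic across sentences" property described above.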
Related papers
- Hierarchical Indexing for Retrieval-Augmented Opinion Summarization [60.5923941324953]
We propose a method for unsupervised abstractive opinion summarization that combines the attributability and scalability of extractive approaches with the coherence and fluency of Large Language Models (LLMs).
Our method, HIRO, learns an index structure that maps sentences to a path through a semantically organized discrete hierarchy.
At inference time, we populate the index and use it to identify and retrieve clusters of sentences containing popular opinions from input reviews.
arXiv Detail & Related papers (2024-03-01T10:38:07Z)
- DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning [59.4644086610381]
We propose a novel denoising objective that takes a different perspective, namely the intra-sentence perspective.
By introducing both discrete and continuous noise, we generate noisy sentences and then train our model to restore them to their original form.
Our empirical evaluations demonstrate that this approach delivers competitive results on both semantic textual similarity (STS) and a wide range of transfer tasks.
arXiv Detail & Related papers (2024-01-24T17:48:45Z)
- Enhancing Coherence of Extractive Summarization with Multitask Learning [40.349019691412465]
This study proposes a multitask learning architecture for extractive summarization with coherence boosting.
The architecture contains an extractive summarizer and coherent discriminator module.
Experiments show that our proposed method significantly improves the proportion of consecutive sentences in the extracted summaries.
arXiv Detail & Related papers (2023-05-22T09:20:58Z)
- Retrieval-Augmented Classification with Decoupled Representation [31.662843145399044]
We propose a $k$-nearest-neighbor (KNN)-based method for retrieval-augmented classification.
We find that shared representation for classification and retrieval hurts performance and leads to training instability.
We evaluate our method on a wide range of classification datasets.
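The summary above describes decoupling the classification and retrieval representations. As a minimal sketch under stated assumptions: a retrieval-head vector queries a datastore of keys, and a soft kNN vote is interpolated with the classifier's probabilities. All names, the exponential weighting, and the mixing weight `lam` are illustrative, not the paper's actual API.

```python
import numpy as np

def knn_augmented_probs(query_ret, bank_keys, bank_labels,
                        clf_probs, n_classes, k=2, lam=0.5):
    """Blend classifier probabilities with a soft kNN vote over a datastore.

    query_ret comes from a retrieval head kept separate from the
    classification head, mirroring the decoupling the summary describes.
    """
    d = np.linalg.norm(bank_keys - query_ret, axis=1)   # distances to stored keys
    nn = np.argsort(d)[:k]                              # indices of k nearest keys
    w = np.exp(-d[nn])                                  # closer neighbours vote harder
    knn_probs = np.zeros(n_classes)
    for wi, lbl in zip(w, bank_labels[nn]):
        knn_probs[lbl] += wi
    knn_probs /= knn_probs.sum()
    return lam * clf_probs + (1 - lam) * knn_probs

# Toy datastore with two stored examples per class.
keys = np.array([[0.0, 0.0], [0.0, 0.1], [5.0, 5.0], [5.0, 5.1]])
labels = np.array([0, 0, 1, 1])
mixed = knn_augmented_probs(np.array([0.0, 0.0]), keys, labels,
                            clf_probs=np.array([0.5, 0.5]), n_classes=2)
```

With the query sitting on top of the class-0 examples, the kNN vote pulls the uncertain classifier distribution toward class 0.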
arXiv Detail & Related papers (2023-03-23T06:33:06Z)
- Easy to Decide, Hard to Agree: Reducing Disagreements Between Saliency Methods [0.15039745292757667]
We show that saliency methods exhibit weak rank correlations even when applied to the same model instance.
Regularization techniques that increase faithfulness of attention explanations also increase agreement between saliency methods.
arXiv Detail & Related papers (2022-11-15T18:18:34Z)
- Concrete Score Matching: Generalized Score Matching for Discrete Data [109.12439278055213]
"Concrete score" is a generalization of the (Stein) score for discrete settings.
"Concrete Score Matching" is a framework to learn such scores from samples.
arXiv Detail & Related papers (2022-11-02T00:41:37Z)
- Pruned Graph Neural Network for Short Story Ordering [0.7087237546722617]
Organizing sentences into an order that maximizes coherence is known as sentence ordering.
We propose a new method for constructing sentence-entity graphs of short stories to create the edges between sentences.
We also observe that replacing pronouns with their referring entities effectively encodes sentences in sentence-entity graphs.
arXiv Detail & Related papers (2022-03-13T22:25:17Z)
- Sequential Recommendation via Stochastic Self-Attention [68.52192964559829]
Transformer-based approaches embed items as vectors and use dot-product self-attention to measure the relationship between items.
We propose a novel STOchastic Self-Attention (STOSA) mechanism to overcome these issues.
We devise a novel Wasserstein Self-Attention module to characterize item-item position-wise relationships in sequences.
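Since STOSA embeds each item as a Gaussian rather than a point vector, item pairs can be scored with a Wasserstein distance instead of a dot product. The sketch below shows only the core quantity, the squared 2-Wasserstein distance between diagonal Gaussians; the paper's full attention module adds learned projections and scaling that are omitted here.

```python
import numpy as np

def w2_squared(mu1, sig1, mu2, sig2):
    """Squared 2-Wasserstein distance between diagonal Gaussians.

    For N(mu1, diag(sig1**2)) and N(mu2, diag(sig2**2)) this reduces to
    ||mu1 - mu2||^2 + ||sig1 - sig2||^2.
    """
    mu1, sig1, mu2, sig2 = map(np.asarray, (mu1, sig1, mu2, sig2))
    return float(np.sum((mu1 - mu2) ** 2) + np.sum((sig1 - sig2) ** 2))

# Identical distributions are at distance zero; shifting a mean by 3 in one
# dimension contributes 9 to the squared distance.
same = w2_squared([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])
shifted = w2_squared([0.0], [1.0], [3.0], [1.0])
```

Unlike a dot product, this score also reacts to the variance terms, which is what lets the model express uncertainty about an item's position in the sequence.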
arXiv Detail & Related papers (2022-01-16T12:38:45Z)
- Variable Instance-Level Explainability for Text Classification [9.147707153504117]
We propose a method for extracting variable-length explanations using a set of different feature scoring methods at instance-level.
Our method consistently provides more faithful explanations compared to previous fixed-length and fixed-feature scoring methods for rationale extraction.
arXiv Detail & Related papers (2021-04-16T16:53:48Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.