Saliency Maps are Ambiguous: Analysis of Logical Relations on First and Second Order Attributions
- URL: http://arxiv.org/abs/2501.14136v1
- Date: Thu, 23 Jan 2025 23:26:27 GMT
- Title: Saliency Maps are Ambiguous: Analysis of Logical Relations on First and Second Order Attributions
- Authors: Leonid Schwenke, Martin Atzmueller
- Abstract summary: We show that the analysed saliency methods fail to capture all classification-relevant information across all possible scenarios.
Specifically, this paper extends our previous work with analyses on additional datasets, in order to better understand in which scenarios the saliency methods fail.
- Score: 0.11510009152620666
- License:
- Abstract: Recent work uncovered potential flaws in, e.g., attribution or heatmap based saliency methods. A typical flaw is confirmation bias, where the scores are compared to human expectation. Since measuring the quality of saliency methods is hard due to missing ground-truth model reasoning, finding general limitations is also hard. This is further complicated because masking-based evaluation on complex data can easily introduce a bias, as most methods cannot fully ignore inputs. In this work, we extend our previous analysis on the logical dataset framework ANDOR, where we showed that all analysed saliency methods fail to capture all classification-relevant information across all possible scenarios. Specifically, this paper extends our previous work with analyses on additional datasets, in order to better understand in which scenarios the saliency methods fail. Further, we apply the Global Coherence Representation as an additional evaluation method in order to enable actual input omission.
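To make the evaluation setting concrete, here is a minimal sketch (illustrative only; it does not reproduce the ANDOR framework or the Global Coherence Representation) of gradient-times-input saliency on a toy AND/OR-style logical dataset, checking whether the logically relevant inputs receive the highest scores.

```python
# Minimal sketch (not the paper's ANDOR framework): gradient x input saliency
# on a toy logical dataset where the label is (x1 AND x2) OR x3, plus a noise input.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data: 4 binary inputs, only the first three are class-relevant.
X = torch.randint(0, 2, (2000, 4)).float()
y = (((X[:, 0] * X[:, 1]) + X[:, 2]) > 0).float()

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

for _ in range(300):
    opt.zero_grad()
    loss = loss_fn(model(X).squeeze(1), y)
    loss.backward()
    opt.step()

# Gradient x input attribution for one sample per logical scenario.
def saliency(x):
    x = x.clone().requires_grad_(True)
    logit = model(x.unsqueeze(0)).squeeze()
    grad, = torch.autograd.grad(logit, x)
    return (grad * x).detach()

for sample in (torch.tensor([1., 1., 0., 1.]),   # class decided by the AND clause
               torch.tensor([0., 1., 1., 0.])):  # class decided by x3 alone
    print(sample.tolist(), "->", saliency(sample).tolist())
```

Note that gradient-times-input assigns zero attribution to any input that is exactly zero, which already hints at how such scores can be ambiguous about which inputs the model actually needed.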
Related papers
- Examining False Positives under Inference Scaling for Mathematical Reasoning [59.19191774050967]
This paper systematically examines the prevalence of false positive solutions in mathematical problem solving for language models.
We explore how false positives influence the inference time scaling behavior of language models.
arXiv Detail & Related papers (2025-02-10T07:49:35Z) - Saliency Methods are Encoders: Analysing Logical Relations Towards Interpretation [0.11510009152620666]
Saliency maps are often generated to improve explainability of neural network models.
This paper introduces a test for saliency map evaluation, proposing experiments based on all possible model reasonings over simple logical datasets.
Using the contained logical relationships, we aim to understand how different saliency methods treat information in different class discriminative scenarios.
Our results show that saliency methods can encode classification relevant information into the ordering of saliency scores.
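As an illustration of that claim, here is a minimal sketch (illustrative setup, not the paper's protocol): compute per-sample saliency scores for a linear classifier, keep only the rank ordering of the scores, and check whether a second classifier trained on the rank vectors alone can still predict the class.

```python
# Minimal sketch (illustrative, not the paper's protocol): test whether the
# *ordering* of per-sample saliency scores is itself class-discriminative,
# i.e. whether the ranking "encodes" label information.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 4))
# Class depends on which of the first two features dominates.
y = (X[:, 0] > X[:, 1]).astype(int)

clf = LogisticRegression().fit(X, y)

# For a linear model the input gradient is the weight vector, so
# gradient x input reduces to w * x per sample.
scores = clf.coef_[0] * X

# Rank features per sample by saliency, then train a second classifier on
# the rank vectors only: accuracy above chance (0.5) means the ordering
# carries class-relevant information.
ranks = np.argsort(np.argsort(scores, axis=1), axis=1)
rank_clf = LogisticRegression().fit(ranks[: n // 2], y[: n // 2])
print("accuracy from score ordering alone:",
      rank_clf.score(ranks[n // 2:], y[n // 2:]))
```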
arXiv Detail & Related papers (2024-12-17T08:55:17Z) - Towards Real World Debiasing: A Fine-grained Analysis On Spurious Correlation [17.080528126651977]
We revisit biased distributions in existing benchmarks and real-world datasets, and propose a fine-grained framework for analyzing dataset bias.
Results show that existing methods are incapable of handling real-world biases.
We propose a simple yet effective approach, named Debias in Destruction (DiD), that can be easily applied to existing debiasing methods.
arXiv Detail & Related papers (2024-05-24T06:06:41Z) - Pre-training and Diagnosing Knowledge Base Completion Models [58.07183284468881]
We introduce and analyze an approach to knowledge transfer from one collection of facts to another without the need for entity or relation matching.
The main contribution is a method that can make use of large-scale pre-training on facts, which were collected from unstructured text.
To understand the obtained pre-trained models better, we then introduce a novel dataset for the analysis of pre-trained models for Open Knowledge Base Completion.
arXiv Detail & Related papers (2024-01-27T15:20:43Z) - Correcting Underrepresentation and Intersectional Bias for Classification [49.1574468325115]
We consider the problem of learning from data corrupted by underrepresentation bias.
We show that with a small amount of unbiased data, we can efficiently estimate the group-wise drop-out rates.
We show that our algorithm permits efficient learning for model classes of finite VC dimension.
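A minimal sketch of the general idea follows, under an assumed per-group filtering model (the estimator and assumptions here are illustrative, not the paper's): compare group frequencies in the biased sample against a small unbiased reference sample to recover group-wise drop-out rates.

```python
# Minimal sketch under an assumed filtering model (not the paper's exact
# estimator): each example from group g survives into the biased sample with
# probability 1 - beta_g; group frequencies in a small unbiased reference
# sample then let us estimate the drop-out rates beta_g.
import numpy as np

rng = np.random.default_rng(1)
true_beta = {"A": 0.1, "B": 0.6}                    # ground-truth drop-out rates
groups = rng.choice(["A", "B"], size=100_000, p=[0.5, 0.5])

# Biased sample: group-dependent drop-out.
keep = rng.random(groups.size) > np.where(groups == "A",
                                          true_beta["A"], true_beta["B"])
biased = groups[keep]
# Small unbiased reference sample.
reference = rng.choice(["A", "B"], size=2_000, p=[0.5, 0.5])

def freq(sample, g):
    return np.mean(sample == g)

# Overall retention rate; here it is read off the simulation, in practice it
# would have to be known or estimated separately.
overall_keep = biased.size / groups.size

for g in ("A", "B"):
    # biased frequency / reference frequency is proportional to (1 - beta_g).
    survival = freq(biased, g) / freq(reference, g)
    est_beta = 1 - survival * overall_keep
    print(g, "estimated drop-out:", round(est_beta, 3), "true:", true_beta[g])
```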
arXiv Detail & Related papers (2023-06-19T18:25:44Z) - Testing for Overfitting [0.0]
We discuss the overfitting problem and explain why standard concentration results do not hold for evaluation with training data.
We introduce and argue for a hypothesis test by means of which model performance may be evaluated using training data.
arXiv Detail & Related papers (2023-05-09T22:49:55Z) - Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z) - Data Representing Ground-Truth Explanations to Evaluate XAI Methods [0.0]
Explainable artificial intelligence (XAI) methods are currently evaluated with approaches mostly originated in interpretable machine learning (IML) research.
We propose to represent explanations with canonical equations that can be used to evaluate the accuracy of XAI methods.
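A minimal sketch of the idea (illustrative equation and attribution method, not the paper's benchmark): generate data from a known canonical equation, train a model on it, and compare feature importances against the known ground truth, where an irrelevant variable should receive near-zero attribution.

```python
# Minimal sketch (illustrative equation and attribution method, not the
# paper's benchmark): data generated from a known canonical equation gives a
# ground truth against which feature attributions can be checked.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 3))
# Canonical equation: y = 3*x1 - 2*x2; x3 is irrelevant by construction.
y = 3 * X[:, 0] - 2 * X[:, 1]

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)

# A faithful explanation method should rank x1 > x2 > x3 and give x3 ~ 0.
for name, imp in zip(["x1", "x2", "x3"], result.importances_mean):
    print(name, round(float(imp), 3))
```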
arXiv Detail & Related papers (2020-11-18T16:54:53Z) - A Critical Assessment of State-of-the-Art in Entity Alignment [1.7725414095035827]
We investigate two state-of-the-art (SotA) methods for the task of Entity Alignment in Knowledge Graphs.
We first carefully examine the benchmarking process and identify several shortcomings, which make the results reported in the original works not always comparable.
arXiv Detail & Related papers (2020-10-30T15:09:19Z) - Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn ⟨sentiment, aspect⟩ joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
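A minimal sketch of the underlying idea (a toy overparameterised linear problem with an arbitrary sampling measure over interpolators, not the paper's setting or methodology): sample many weight vectors that all fit the training labels exactly and inspect how their test errors are distributed.

```python
# Minimal sketch (toy overparameterised linear problem, illustrative only):
# every weight vector of the form (min-norm solution + null-space component)
# interpolates the training set; we sample many such vectors and look at the
# spread of their test errors.
import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_test = 100, 40, 2000
w_true = rng.normal(size=d)

X_tr = rng.normal(size=(n_train, d))
y_tr = np.sign(X_tr @ w_true)
X_te = rng.normal(size=(n_test, d))
y_te = np.sign(X_te @ w_true)

# Min-norm interpolator plus random directions from the null space of X_tr.
w_min = np.linalg.pinv(X_tr) @ y_tr
_, _, Vt = np.linalg.svd(X_tr, full_matrices=True)
null_basis = Vt[n_train:]            # (d - n_train) directions with X_tr @ v = 0

test_errors = []
for _ in range(1000):
    w = w_min + null_basis.T @ rng.normal(size=d - n_train)
    assert np.all(np.sign(X_tr @ w) == y_tr)      # still interpolates
    test_errors.append(np.mean(np.sign(X_te @ w) != y_te))

print("typical (median) test error:", np.median(test_errors),
      "| worst sampled:", max(test_errors))
```

The isotropic sampling over the null space here is an arbitrary choice; the paper's concentration statement concerns its own distribution over interpolating classifiers, so this sketch only illustrates the object being studied, not the result.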
arXiv Detail & Related papers (2020-06-22T21:12:31Z)