Investigating the Impact of Model Instability on Explanations and Uncertainty
- URL: http://arxiv.org/abs/2402.13006v2
- Date: Tue, 4 Jun 2024 11:18:03 GMT
- Title: Investigating the Impact of Model Instability on Explanations and Uncertainty
- Authors: Sara Vera Marjanović, Isabelle Augenstein, Christina Lioma
- Abstract summary: We simulate uncertainty in text input by introducing noise at inference time.
We find that high uncertainty doesn't necessarily imply low explanation plausibility.
This suggests that noise-augmented models may be better at identifying salient tokens when uncertain.
- Score: 43.254616360807496
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Explainable AI methods facilitate the understanding of model behaviour, yet small, imperceptible perturbations to inputs can vastly distort explanations. As these explanations are typically evaluated holistically, before model deployment, it is difficult to assess when a particular explanation is trustworthy. Some studies have tried to create confidence estimators for explanations, but none have investigated the link between uncertainty and explanation quality. We artificially simulate epistemic uncertainty in text input by introducing noise at inference time. In this large-scale empirical study, we insert different levels of noise perturbations and measure the effect on the output of pre-trained language models and different uncertainty metrics. Realistic perturbations have minimal effect on performance and explanations, yet masking has a drastic effect. We find that high uncertainty doesn't necessarily imply low explanation plausibility; the correlation between the two metrics can be moderately positive when the model is exposed to noise during training. This suggests that noise-augmented models may be better at identifying salient tokens when uncertain. Furthermore, when predictive and epistemic uncertainty measures are over-confident, the robustness of a saliency map to perturbation can indicate model stability issues. Integrated Gradients shows the overall greatest robustness to perturbation, while still showing model-specific patterns in performance; however, this phenomenon is limited to smaller Transformer-based language models.
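As a rough illustration of the setup described in the abstract, the sketch below injects mask-based noise into a text input at inference time and reads off predictive entropy as one possible uncertainty signal. It is a minimal sketch under assumed choices: the checkpoint name, whitespace-level masking, and entropy as the metric are illustrative, and the paper's actual models, perturbation types, and uncertainty measures may differ.
```python
# Minimal sketch (not the authors' code): perturb a text input at inference time
# and watch how predictive uncertainty (softmax entropy) responds.
import random

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME).eval()

def mask_noise(text: str, p: float) -> str:
    """Replace a fraction p of whitespace-delimited words with the mask token."""
    return " ".join(
        tokenizer.mask_token if random.random() < p else word for word in text.split()
    )

@torch.no_grad()
def predictive_entropy(text: str) -> float:
    """Entropy of the output distribution; higher means more uncertain."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    probs = model(**inputs).logits.softmax(dim=-1).squeeze(0)
    return float(-(probs * probs.clamp_min(1e-12).log()).sum())

sentence = "The movie was surprisingly good."
for p in (0.0, 0.1, 0.3, 0.5):  # increasing noise levels
    noisy = mask_noise(sentence, p)
    print(f"p={p:.1f}  entropy={predictive_entropy(noisy):.3f}  {noisy!r}")
```
In the same spirit, the robustness of a saliency map could be probed by comparing attributions (for example Integrated Gradients, as implemented in libraries such as Captum) on the clean and noisy inputs; that comparison is omitted here to keep the sketch short.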
Related papers
- Extreme Miscalibration and the Illusion of Adversarial Robustness [66.29268991629085]
Adversarial Training is often used to increase model robustness.
We show that this observed gain in robustness is an illusion of robustness (IOR).
We urge the NLP community to incorporate test-time temperature scaling into their robustness evaluations (see the sketch after this entry).
arXiv Detail & Related papers (2024-02-27T13:49:12Z)
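The entry above calls for test-time temperature scaling before robustness is measured. As a point of reference, the toy sketch below shows what temperature scaling does to a classifier's confidence; the logits and temperature values are illustrative assumptions, not numbers reported by that paper.
```python
# Toy illustration (assumed values): temperature scaling divides the logits by T
# before the softmax, softening over-confident predictions when T > 1.
import torch

def temperature_scale(logits: torch.Tensor, temperature: float) -> torch.Tensor:
    return torch.softmax(logits / temperature, dim=-1)

logits = torch.tensor([[6.0, 1.0, 0.5]])   # an over-confident prediction
print(temperature_scale(logits, 1.0))      # raw probabilities, near one-hot
print(temperature_scale(logits, 3.0))      # calibrated, noticeably softer
```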
- Identifying Drivers of Predictive Aleatoric Uncertainty [2.5311562666866494]
We present a simple approach to explain predictive aleatoric uncertainties.
We estimate uncertainty as predictive variance by adapting a neural network with a Gaussian output distribution (a minimal sketch of this idea follows this entry).
We quantify our findings with a nuanced benchmark analysis that includes real-world datasets.
arXiv Detail & Related papers (2023-12-12T13:28:53Z)
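The aleatoric-uncertainty entry above describes estimating uncertainty as predictive variance via a Gaussian output distribution. The sketch below shows the standard heteroscedastic formulation of that idea (a mean head plus a log-variance head trained with the Gaussian negative log-likelihood); the network size, toy data, and training details are illustrative assumptions, not the paper's actual setup.
```python
# Minimal heteroscedastic-regression sketch (assumed setup, not the paper's code):
# the network outputs a mean and a log-variance, and the Gaussian negative
# log-likelihood lets the predicted variance act as aleatoric uncertainty.
import torch
import torch.nn as nn

class GaussianRegressor(nn.Module):
    def __init__(self, in_dim: int, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mean_head = nn.Linear(hidden, 1)
        self.log_var_head = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.log_var_head(h)

def gaussian_nll(mean, log_var, target):
    # 0.5 * (log sigma^2 + (y - mu)^2 / sigma^2), up to an additive constant
    return (0.5 * (log_var + (target - mean) ** 2 * torch.exp(-log_var))).mean()

# Toy data: y = 2x with noise whose scale grows with |x| (heteroscedastic).
x = torch.linspace(-2, 2, 256).unsqueeze(1)
y = 2 * x + torch.randn_like(x) * (0.1 + 0.5 * x.abs())

model = GaussianRegressor(in_dim=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(500):
    mean, log_var = model(x)
    loss = gaussian_nll(mean, log_var, y)
    opt.zero_grad()
    loss.backward()
    opt.step()

mean, log_var = model(x)
print("predictive std near x=0 vs x=2:",
      log_var[128].exp().sqrt().item(), log_var[-1].exp().sqrt().item())
```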
- Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning [76.98542249776257]
Large-scale language models often face the challenge of "hallucination".
We introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
arXiv Detail & Related papers (2023-10-07T12:06:53Z)
- ALUM: Adversarial Data Uncertainty Modeling from Latent Model Uncertainty Compensation [25.67258563807856]
We propose a novel method called ALUM to handle the model uncertainty and data uncertainty in a unified scheme.
Our proposed ALUM is model-agnostic which can be easily implemented into any existing deep model with little extra overhead.
arXiv Detail & Related papers (2023-03-29T17:24:12Z)
- Adversarial Counterfactual Visual Explanations [0.7366405857677227]
This paper proposes an elegant method to turn adversarial attacks into semantically meaningful perturbations.
The proposed approach hypothesizes that Denoising Diffusion Probabilistic Models are excellent regularizers for avoiding high-frequency and out-of-distribution perturbations.
arXiv Detail & Related papers (2023-03-17T13:34:38Z)
- Uncertainty-Aware Learning Against Label Noise on Imbalanced Datasets [23.4536532321199]
Inspired by our observations, we propose an Uncertainty-aware Label Correction framework to handle label noise on imbalanced datasets.
arXiv Detail & Related papers (2022-07-12T11:35:55Z)
- Robustness and Accuracy Could Be Reconcilable by (Proper) Definition [109.62614226793833]
The trade-off between robustness and accuracy has been widely studied in the adversarial literature.
We find that it may stem from the improperly defined robust error, which imposes an inductive bias of local invariance.
The proposed self-consistent robust error (SCORE) facilitates the reconciliation between robustness and accuracy, while still handling the worst-case uncertainty.
arXiv Detail & Related papers (2022-02-21T10:36:09Z)
- Dense Uncertainty Estimation via an Ensemble-based Conditional Latent Variable Model [68.34559610536614]
We argue that the aleatoric uncertainty is an inherent attribute of the data and can only be correctly estimated with an unbiased oracle model.
We propose a new sampling and selection strategy at train time to approximate the oracle model for aleatoric uncertainty estimation.
Our results show that our solution achieves both accurate deterministic results and reliable uncertainty estimation.
arXiv Detail & Related papers (2021-11-22T08:54:10Z)
- Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
- Uncertainty-Aware Reliable Text Classification [21.517852608625127]
Deep neural networks have contributed significantly to predictive accuracy in classification tasks.
However, they tend to make over-confident predictions in real-world settings, where domain shift and out-of-distribution examples exist.
We propose an inexpensive framework that adopts both auxiliary outliers and pseudo off-manifold samples to train the model with prior knowledge of a certain class.
arXiv Detail & Related papers (2021-07-15T04:39:55Z)