Diverse, Global and Amortised Counterfactual Explanations for
Uncertainty Estimates
- URL: http://arxiv.org/abs/2112.02646v3
- Date: Thu, 9 Dec 2021 01:48:15 GMT
- Title: Diverse, Global and Amortised Counterfactual Explanations for
Uncertainty Estimates
- Authors: Dan Ley, Umang Bhatt, Adrian Weller
- Abstract summary: We study the diversity of such sets and find that many CLUEs are redundant.
We then propose GLobal AMortised CLUE (GLAM-CLUE), a distinct and novel method which learns amortised mappings on specific groups of uncertain inputs.
Our experiments show that $\delta$-CLUE, $\nabla$-CLUE, and GLAM-CLUE all address shortcomings of CLUE and provide beneficial explanations of uncertainty estimates to practitioners.
- Score: 31.241489953967694
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To interpret uncertainty estimates from differentiable probabilistic models,
recent work has proposed generating a single Counterfactual Latent Uncertainty
Explanation (CLUE) for a given data point where the model is uncertain,
identifying a single, on-manifold change to the input such that the model
becomes more certain in its prediction. We broaden the exploration to examine
$\delta$-CLUE, the set of potential CLUEs within a $\delta$ ball of the
original input in latent space. We study the diversity of such sets and find
that many CLUEs are redundant; as such, we propose DIVerse CLUE
($\nabla$-CLUE), a set of CLUEs which each propose a distinct explanation as to
how one can decrease the uncertainty associated with an input. We then further
propose GLobal AMortised CLUE (GLAM-CLUE), a distinct and novel method which
learns amortised mappings on specific groups of uncertain inputs, taking them
and efficiently transforming them in a single function call into inputs for
which a model will be certain. Our experiments show that $\delta$-CLUE,
$\nabla$-CLUE, and GLAM-CLUE all address shortcomings of CLUE and provide
beneficial explanations of uncertainty estimates to practitioners.
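Below is a minimal, hedged sketch of how a $\delta$-CLUE-style search and a GLAM-CLUE-style amortised mapping might look in code. It is not the authors' implementation: `encoder`, `decoder`, and `predictive_entropy` are assumed stand-ins for a deep generative model's encoder/decoder and the probabilistic model's uncertainty on a decoded input, and random restarts are only a loose proxy for the paper's diversity criterion.

```python
# Hedged sketch, not the authors' code: a delta-CLUE-style search for a set of
# counterfactuals inside a delta ball in latent space.
import torch

def delta_clue_search(x0, encoder, decoder, predictive_entropy,
                      delta=1.0, n_clues=5, steps=100, lr=0.05):
    z0 = encoder(x0).detach()
    clues = []
    for _ in range(n_clues):
        # Random restarts inside the ball: a loose proxy for the diversity
        # criterion of nabla-CLUE.
        z = (z0 + 0.1 * delta * torch.randn_like(z0)).requires_grad_(True)
        opt = torch.optim.Adam([z], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = predictive_entropy(decoder(z))  # push the model toward certainty
            loss.backward()
            opt.step()
            with torch.no_grad():  # project back onto the delta ball around z0
                offset = z - z0
                norm = offset.norm()
                if norm > delta:
                    z.copy_(z0 + offset * (delta / norm))
        clues.append(decoder(z).detach())
    return clues

# GLAM-CLUE-style amortisation (also a sketch): rather than optimising per
# input, train one network g so that g(x_uncertain) lands where the model is
# certain; at inference a counterfactual is then a single forward pass g(x).
def train_glam(g, uncertain_batch, predictive_entropy, epochs=200, lr=1e-3):
    opt = torch.optim.Adam(g.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        out = g(uncertain_batch)
        # Stay close to the originals while reducing the model's uncertainty.
        loss = predictive_entropy(out) + 0.1 * (out - uncertain_batch).pow(2).mean()
        loss.backward()
        opt.step()
    return g
```

The design point GLAM-CLUE targets is visible in the second function: after training, producing a certain counterfactual costs one forward pass, with no per-input optimisation.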
Related papers
- CLUE: Concept-Level Uncertainty Estimation for Large Language Models [49.92690111618016]
We propose a novel framework for Concept-Level Uncertainty Estimation for Large Language Models (LLMs).
We leverage LLMs to convert output sequences into concept-level representations, breaking down sequences into individual concepts and measuring the uncertainty of each concept separately.
We conduct experiments to demonstrate that CLUE can provide more interpretable uncertainty estimation results compared with sentence-level uncertainty.
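As a rough illustration of the concept-level idea (not the paper's exact estimator), one might score each concept by how often independently sampled outputs support it; `extract_concepts` and `supports` below are hypothetical helpers, e.g. an LLM prompt and an NLI entailment check.

```python
# Hedged sketch: per-concept uncertainty as disagreement across resampled outputs.
def concept_uncertainties(answer, samples, extract_concepts, supports):
    """Score each concept in `answer` by how often resampled outputs support it."""
    scores = {}
    for concept in extract_concepts(answer):
        agree = sum(supports(sample, concept) for sample in samples)
        scores[concept] = 1.0 - agree / len(samples)  # high = uncertain concept
    return scores
```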
arXiv Detail & Related papers (2024-09-04T18:27:12Z)
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
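One plausible reading of this, sketched below with a hypothetical `generate` function that samples (answer, explanation) pairs at nonzero temperature: answers backed by a stable, recurring rationale earn higher confidence than answers whose explanations vary wildly. The paper's actual estimator may weigh explanations differently.

```python
# Hedged sketch: confidence from the distribution of sampled explanations.
from collections import Counter

def explanation_confidence(question, generate, n_samples=10):
    pairs = [generate(question) for _ in range(n_samples)]
    answer_counts = Counter(answer for answer, _ in pairs)
    top_answer, count = answer_counts.most_common(1)[0]
    # Count *distinct* explanations backing the top answer: few distinct
    # rationales repeated often suggest a stable, genuinely confident answer.
    distinct = len({expl for answer, expl in pairs if answer == top_answer})
    return top_answer, count / n_samples, distinct
```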
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- Language Model Cascades: Token-level uncertainty and beyond [65.38515344964647]
Recent advances in language models (LMs) have led to significant improvements in quality on complex NLP tasks.
Cascading offers a simple strategy to achieve more favorable cost-quality tradeoffs.
We show that incorporating token-level uncertainty through learned post-hoc deferral rules can significantly outperform simple aggregation strategies.
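One way such a learned deferral rule could look, assuming a held-out set where the small model's correctness is known; the feature set and classifier below are illustrative, not the paper's.

```python
# Hedged sketch of a learned post-hoc deferral rule for a two-model cascade.
import numpy as np
from sklearn.linear_model import LogisticRegression

def token_features(token_logprobs):
    lp = np.asarray(token_logprobs)
    # Quantiles capture "a few very uncertain tokens", which a plain mean hides.
    return np.array([lp.mean(), lp.min(), *np.quantile(lp, [0.1, 0.25, 0.5])])

def fit_deferral_rule(feature_rows, small_model_wrong):
    # Labels: 1 where the small model erred, i.e. deferring would have helped.
    return LogisticRegression().fit(np.stack(feature_rows), small_model_wrong)

def should_defer(rule, token_logprobs, threshold=0.5):
    return rule.predict_proba(token_features(token_logprobs)[None])[0, 1] > threshold
```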
arXiv Detail & Related papers (2024-04-15T21:02:48Z)
- Invariant Causal Prediction with Local Models [52.161513027831646]
We consider the task of identifying the causal parents of a target variable among a set of candidates from observational data.
We introduce a practical method called L-ICP ($\textbf{L}$ocalized $\textbf{I}$nvariant $\textbf{C}$ausal $\textbf{P}$rediction), which is based on a hypothesis test for parent identification using a ratio of minimum and maximum statistics.
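The paper's ratio-of-min-max test statistic cannot be reproduced from the abstract alone, but a generic ICP-style search over candidate parent sets, sketched below with a crude coefficient-spread check standing in for a proper hypothesis test, conveys the shape of the approach.

```python
# Hedged sketch of a generic ICP-style parent search (not the paper's L-ICP
# statistic): a candidate set is accepted if its regression coefficients are
# nearly identical across environments. Exhaustive; fine for few candidates.
import numpy as np
from itertools import combinations

def env_coefficients(X, y, envs, subset):
    coefs = []
    for e in np.unique(envs):
        Xe, ye = X[envs == e][:, subset], y[envs == e]
        coefs.append(np.linalg.lstsq(Xe, ye, rcond=None)[0])
    return np.stack(coefs)

def invariant_sets(X, y, envs, tol=0.1):
    accepted = []
    for r in range(1, X.shape[1] + 1):
        for subset in combinations(range(X.shape[1]), r):
            coefs = env_coefficients(X, y, envs, list(subset))
            # Crude invariance check: max spread of any coefficient across envs.
            if np.ptp(coefs, axis=0).max() < tol:
                accepted.append(subset)
    return accepted
```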
arXiv Detail & Related papers (2024-01-10T15:34:42Z)
- Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
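A hedged sketch of the ensembling and decomposition step: with a hypothetical `clarify` helper that produces unambiguous rewrites of the input and a `predict_probs` function returning a class distribution, total predictive entropy splits into a part the model contributes and a part attributable to input ambiguity.

```python
# Hedged sketch of input clarification ensembling and entropy decomposition.
import numpy as np

def entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum()

def decomposed_uncertainty(x, clarify, predict_probs, n_clarifications=5):
    probs = np.stack([predict_probs(c) for c in clarify(x, n_clarifications)])
    total = entropy(probs.mean(axis=0))           # uncertainty of the ensemble
    model = np.mean([entropy(p) for p in probs])  # mean per-clarification uncertainty
    # Total minus model uncertainty ~ uncertainty attributable to input ambiguity.
    return {"total": total, "from_input": total - model, "from_model": model}
```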
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
- Understanding Contrastive Learning via Distributionally Robust Optimization [29.202594242468678]
This study reveals the inherent tolerance of contrastive learning (CL) towards sampling bias, wherein negative samples may encompass similar semantics (e.g., labels).
We bridge this research gap by analyzing CL through the lens of distributionally robust optimization (DRO), yielding several key insights.
We also identify CL's potential shortcomings, including over-conservatism and sensitivity to outliers, and introduce a novel Adjusted InfoNCE loss (ADNCE) to mitigate these issues.
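The exact ADNCE weighting is not given in the blurb; the sketch below implements one plausible version, an InfoNCE loss whose negatives are reweighted by a Gaussian window over similarity so that the hardest (often false) negatives count less. The parameters mu and sigma are illustrative.

```python
# Hedged sketch of an adjusted InfoNCE in the spirit of ADNCE.
import torch
import torch.nn.functional as F

def adjusted_info_nce(anchor, positive, negatives, tau=0.5, mu=0.5, sigma=1.0):
    # anchor, positive: (B, D); negatives: (B, K, D)
    pos_sim = F.cosine_similarity(anchor, positive, dim=-1) / tau
    neg_sim = F.cosine_similarity(anchor.unsqueeze(1), negatives, dim=-1) / tau
    with torch.no_grad():
        # Down-weight the most extreme negatives via a Gaussian window.
        w = torch.exp(-(neg_sim - mu) ** 2 / (2 * sigma ** 2))
        w = w / w.mean(dim=1, keepdim=True)  # keep the loss scale comparable
    denom = torch.exp(pos_sim) + (w * torch.exp(neg_sim)).sum(dim=1)
    return -(pos_sim - torch.log(denom)).mean()
```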
arXiv Detail & Related papers (2023-10-17T07:32:59Z)
- Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations [56.941276017696076]
We propose a conceptually simple yet effective solution named Counterfactual Explanations with Minimal Satisfiable Perturbations (CEMSP).
CEMSP constrains changing values of abnormal features with the help of their semantically meaningful normal ranges.
We conduct comprehensive experiments on both synthetic and real-world datasets to demonstrate that, compared to existing methods, ours provides more robust explanations while preserving flexibility.
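An enumerative stand-in for the CEMSP idea follows (the paper's formulation is more sophisticated): only abnormal features may change, each is clipped into its normal range, and the smallest subset of such fixes that flips the prediction wins.

```python
# Hedged greedy sketch of the CEMSP idea, not the paper's algorithm.
import numpy as np
from itertools import combinations

def minimal_satisfiable_fix(x, predict, normal_ranges):
    """normal_ranges: {feature_index: (low, high)}; predict: x -> 0/1 label."""
    abnormal = [i for i, (lo, hi) in normal_ranges.items()
                if not lo <= x[i] <= hi]
    for size in range(1, len(abnormal) + 1):      # try smallest subsets first
        for subset in combinations(abnormal, size):
            x_cf = x.copy()
            for i in subset:
                lo, hi = normal_ranges[i]
                x_cf[i] = np.clip(x[i], lo, hi)   # move into the normal range
            if predict(x_cf) != predict(x):
                return x_cf, subset
    return None, ()
```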
arXiv Detail & Related papers (2023-09-09T04:05:56Z)
- Environment Invariant Linear Least Squares [18.387614531869826]
This paper considers a multi-environment linear regression model in which data from multiple experimental settings are collected.
We construct a novel environment invariant linear least squares (EILLS) objective function, a multi-environment version of linear least-squares regression.
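A hedged sketch of what an EILLS-style objective could look like: pooled least squares plus a penalty driving the per-environment correlation between residuals and the used covariates to zero. The paper's objective and its support selection differ in detail; `gamma` is illustrative.

```python
# Hedged sketch of an EILLS-style multi-environment objective; minimise over
# beta with any smooth optimiser.
import numpy as np

def eills_objective(beta, X_envs, y_envs, gamma=10.0):
    fit, invariance = 0.0, 0.0
    for X, y in zip(X_envs, y_envs):
        resid = y - X @ beta
        fit += (resid ** 2).mean()
        # E_e[x_j * resid] should vanish in every environment for causal j.
        invariance += ((X * resid[:, None]).mean(axis=0) ** 2).sum()
    return fit + gamma * invariance
```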
arXiv Detail & Related papers (2023-03-06T13:10:54Z)
- Open-Set Likelihood Maximization for Few-Shot Learning [36.97433312193586]
We tackle the Few-Shot Open-Set Recognition (FSOSR) problem, i.e. classifying instances among a set of classes for which we only have a few labeled samples.
We explore the popular transductive setting, which leverages the unlabelled query instances at inference.
Motivated by the observation that existing transductive methods perform poorly in open-set scenarios, we propose a generalization of the maximum likelihood principle.
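An EM-flavoured sketch of the transductive open-set idea, with latent inlier weights on the unlabelled queries; initial prototypes could be support-class means. The paper's likelihood model and updates differ.

```python
# Hedged sketch: refine class prototypes using queries, softly down-weighting
# likely outliers via a latent inlier score.
import numpy as np

def open_set_refine(protos, queries, n_iters=10, outlier_score=-8.0):
    # protos: (C, D) initial class prototypes; queries: (N, D) unlabelled.
    for _ in range(n_iters):
        ll = -((queries[:, None] - protos[None]) ** 2).sum(-1)  # (N, C)
        best = ll.max(axis=1)
        # E-step: soft inlier weight against a flat outlier hypothesis.
        inlier = 1.0 / (1.0 + np.exp(outlier_score - best))
        resp = np.exp(ll - best[:, None])
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: refit prototypes with inlier-weighted soft assignments.
        w = resp * inlier[:, None]
        protos = (w.T @ queries) / np.maximum(w.sum(axis=0)[:, None], 1e-8)
    return protos, inlier
```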
arXiv Detail & Related papers (2023-01-20T01:56:19Z)
- $\delta$-CLUE: Diverse Sets of Explanations for Uncertainty Estimates [31.241489953967694]
We augment the original CLUE approach to provide what we call $\delta$-CLUE.
We instead return a $\textit{set}$ of plausible CLUEs: multiple, diverse inputs that are within a $\delta$ ball of the original input in latent space.
arXiv Detail & Related papers (2021-04-13T16:03:27Z)
- Getting a CLUE: A Method for Explaining Uncertainty Estimates [30.367995696223726]
We propose a novel method for interpreting uncertainty estimates from differentiable probabilistic models.
Our method, Counterfactual Latent Uncertainty Explanations (CLUE), indicates how to change an input, while keeping it on the data manifold.
arXiv Detail & Related papers (2020-06-11T21:53:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.