Tell Me Why: Incentivizing Explanations
- URL: http://arxiv.org/abs/2502.13410v1
- Date: Wed, 19 Feb 2025 03:47:34 GMT
- Title: Tell Me Why: Incentivizing Explanations
- Authors: Siddarth Srinivasan, Ezra Karger, Michiel Bakker, Yiling Chen
- Abstract summary: There is no known mechanism that provides incentives to elicit explanations for beliefs from agents.
Standard Bayesian models make assumptions that preempt the need for explanations.
This work argues that rationales (explanations of an agent's private information) lead to more efficient aggregation.
- Score: 3.2754470919268543
- License:
- Abstract: Common sense suggests that when individuals explain why they believe something, we can arrive at more accurate conclusions than when they simply state what they believe. Yet, there is no known mechanism that provides incentives to elicit explanations for beliefs from agents. This likely stems from the fact that standard Bayesian models make assumptions (like conditional independence of signals) that preempt the need for explanations, in order to show efficient information aggregation. A natural justification for the value of explanations is that agents' beliefs tend to be drawn from overlapping sources of information, so agents' belief reports do not reveal all that needs to be known. Indeed, this work argues that rationales (explanations of an agent's private information) lead to more efficient aggregation by allowing agents to efficiently identify what information they share and what information is new. Building on this model of rationales, we present a novel 'deliberation mechanism' to elicit rationales from agents in which truthful reporting of beliefs and rationales is a perfect Bayesian equilibrium.
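The abstract's core argument, that overlapping information sources make raw belief reports insufficient, can be illustrated with a minimal sketch. This is a hypothetical toy example with made-up numbers, not the paper's mechanism: two agents observe partially overlapping independent signals about a binary event, and naively summing their posterior log-odds double-counts the shared evidence, while reporting rationales (which signals each agent saw) lets the aggregator combine the union of signals exactly once.

```python
import math

# Toy setup (hypothetical, not from the paper): a binary event with a flat
# prior, where each independent signal contributes a log-likelihood ratio.

def posterior(signals):
    """Posterior P(event) from a flat prior, given signal log-likelihood ratios."""
    log_odds = sum(signals.values())
    return 1 / (1 + math.exp(-log_odds))

# Signals A and B are shared by both agents; signal C is private to agent 2.
signals = {"A": 1.0, "B": 0.5, "C": 0.8}
agent1 = {k: signals[k] for k in ("A", "B")}
agent2 = {k: signals[k] for k in ("A", "B", "C")}

# Naive aggregation: add both agents' log-odds, double-counting A and B.
naive_log_odds = sum(agent1.values()) + sum(agent2.values())
naive = 1 / (1 + math.exp(-naive_log_odds))

# Rationale-based aggregation: each agent reports *which* signals it saw,
# so the aggregator pools the union of signals and counts each one once.
union = {**agent1, **agent2}
correct = posterior(union)

print(f"naive: {naive:.3f}  with rationales: {correct:.3f}")
```

Here the naive aggregate is overconfident relative to the rationale-based one, because the shared signals A and B enter the naive sum twice.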
Related papers
- Counterfactual Explanations as Plans [6.445239204595516]
We look to provide a formal account of counterfactual explanations in terms of action sequences.
We then show that this naturally leads to an account of model reconciliation, which might take the form of the user correcting the agent's model, or suggesting actions to the agent's plan.
arXiv Detail & Related papers (2025-02-13T11:45:54Z)
- Uncommon Belief in Rationality
This paper proposes a graph-based language for capturing higher-order beliefs that agents might have about the rationality of the other agents.
The two main contributions are a solution concept that captures the reasoning process based on a given belief structure and an efficient algorithm for compressing any belief structure into a unique minimal form.
arXiv Detail & Related papers (2024-12-12T16:12:40Z)
- Dissenting Explanations: Leveraging Disagreement to Reduce Model Overreliance
We introduce the notion of dissenting explanations: conflicting predictions with accompanying explanations.
We first explore the advantage of dissenting explanations in the setting of model multiplicity.
We demonstrate that dissenting explanations reduce overreliance on model predictions, without reducing overall accuracy.
arXiv Detail & Related papers (2023-07-14T21:27:00Z)
- Eliminating The Impossible, Whatever Remains Must Be True
We show how one can apply background knowledge to give more succinct "why" formal explanations.
We also show how to use existing rule induction techniques to efficiently extract background information from a dataset.
arXiv Detail & Related papers (2022-06-20T03:18:14Z)
- Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals.
It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation.
It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
arXiv Detail & Related papers (2022-05-25T03:40:59Z)
- Features of Explainability: How users understand counterfactual and causal explanations for categorical and continuous features in XAI [10.151828072611428]
Counterfactual explanations are increasingly used to address interpretability, recourse, and bias in AI decisions.
We tested the effects of counterfactual and causal explanations on the objective accuracy of users' predictions.
We also found that users understand explanations referring to categorical features more readily than those referring to continuous features.
arXiv Detail & Related papers (2022-04-21T15:01:09Z)
- Properties from Mechanisms: An Equivariance Perspective on Identifiable Representation Learning [79.4957965474334]
A key goal of unsupervised representation learning is "inverting" a data-generating process to recover its latent properties.
This paper asks, "Can we instead identify latent properties by leveraging knowledge of the mechanisms that govern their evolution?"
We provide a complete characterization of the sources of non-identifiability as we vary knowledge about a set of possible mechanisms.
arXiv Detail & Related papers (2021-10-29T14:04:08Z)
- Are Training Resources Insufficient? Predict First Then Explain! [54.184609286094044]
We argue that the predict-then-explain (PtE) architecture is the more efficient approach from a modelling perspective.
We show that the PtE structure is the most data-efficient approach when explanation data are lacking.
arXiv Detail & Related papers (2021-08-29T07:04:50Z)
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
- Empirically Verifying Hypotheses Using Reinforcement Learning [58.09414653169534]
This paper formulates hypothesis verification as an RL problem.
We aim to build an agent that, given a hypothesis about the dynamics of the world, can take actions to generate observations which can help predict whether the hypothesis is true or false.
arXiv Detail & Related papers (2020-06-29T01:01:10Z)
- Towards the Role of Theory of Mind in Explanation [23.818659473644505]
Theory of Mind is the ability to attribute mental states (e.g., beliefs, goals) to oneself and to others.
Previous work has observed that Theory of Mind capabilities are central to providing an explanation to another agent.
arXiv Detail & Related papers (2020-05-06T17:13:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.