A Bayesian Account of Measures of Interpretability in Human-AI
Interaction
- URL: http://arxiv.org/abs/2011.10920v1
- Date: Sun, 22 Nov 2020 03:28:28 GMT
- Title: A Bayesian Account of Measures of Interpretability in Human-AI
Interaction
- Authors: Sarath Sreedharan, Anagha Kulkarni, Tathagata Chakraborti, David E.
Smith and Subbarao Kambhampati
- Abstract summary: Existing approaches for the design of interpretable agent behavior consider different measures of interpretability in isolation.
We propose a revised model where all these behaviors can be meaningfully modeled together.
- We will highlight interesting consequences of this unified model and motivate, through results of a user study, why this revision is necessary.
- Score: 34.99424576619341
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing approaches for the design of interpretable agent behavior consider
different measures of interpretability in isolation. In this paper we posit
that, in the design and deployment of human-aware agents in the real world,
notions of interpretability are just some among many considerations; and the
techniques developed in isolation lack two key properties to be useful when
considered together: they need to be able to 1) deal with their mutually
competing properties; and 2) operate in an open world where the human is not
there just to interpret behavior in one specific form. To this end, we consider three
well-known instances of interpretable behavior studied in existing literature
-- namely, explicability, legibility, and predictability -- and propose a
revised model where all these behaviors can be meaningfully modeled together.
We will highlight interesting consequences of this unified model and motivate,
through results of a user study, why this revision is necessary.
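
To make the abstract's unified view concrete, below is a minimal, illustrative sketch of a Bayesian observer whose belief over an agent's possible goals is updated from observed actions, with explicability, legibility, and predictability read off from that belief. This is not the paper's actual formulation: the toy goal set, the likelihood table, and the specific scoring choices are hypothetical assumptions made only for illustration.

```python
# Minimal illustrative sketch (NOT the paper's formulation): a Bayesian observer
# maintains a belief over which goal a simple agent is pursuing, and three
# interpretability-style scores are read off from that belief. The goal set,
# likelihood table, and scoring choices are hypothetical assumptions.

import numpy as np

GOALS = ["G1", "G2"]          # hypothetical goals the observer considers
TRUE_GOAL = "G1"              # the goal the agent is actually pursuing (assumed)

def likelihood(action, goal):
    """P(action | goal): the observer's (assumed) model of goal-directed behavior."""
    table = {
        ("left", "G1"): 0.8, ("right", "G1"): 0.2,
        ("left", "G2"): 0.3, ("right", "G2"): 0.7,
    }
    return table[(action, goal)]

def update_belief(belief, action):
    """One Bayesian update of the observer's belief over goals."""
    posterior = np.array([belief[i] * likelihood(action, g) for i, g in enumerate(GOALS)])
    return posterior / posterior.sum()

def observe(prefix, next_action):
    """Score an observed action prefix for explicability, legibility, predictability."""
    belief = np.array([0.5, 0.5])   # uniform prior over goals
    explicability = 1.0
    for action in prefix:
        # Explicability (here): marginal probability of each observed action
        # under the observer's current belief, accumulated over the prefix.
        explicability *= sum(belief[i] * likelihood(action, g) for i, g in enumerate(GOALS))
        belief = update_belief(belief, action)
    # Legibility (here): posterior mass the observer places on the agent's true goal.
    legibility = belief[GOALS.index(TRUE_GOAL)]
    # Predictability (here): probability the observer assigns to the agent's actual next action.
    predictability = sum(belief[i] * likelihood(next_action, g) for i, g in enumerate(GOALS))
    return explicability, legibility, predictability

if __name__ == "__main__":
    e, l, p = observe(prefix=["left", "left"], next_action="left")
    print(f"explicability={e:.3f}  legibility={l:.3f}  predictability={p:.3f}")
```

The intended takeaway is only structural: once the observer's belief is an explicit object, the three measures become different functions of the same belief and can therefore be traded off against one another.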
Related papers
- On the Fairness, Diversity and Reliability of Text-to-Image Generative Models [49.60774626839712]
Multimodal generative models have sparked critical discussions on their fairness, reliability, and potential for misuse.
We propose an evaluation framework designed to assess model reliability through their responses to perturbations in the embedding space.
Our method lays the groundwork for detecting unreliable, bias-injected models and for retrieval of bias provenance.
arXiv Detail & Related papers (2024-11-21T09:46:55Z)
- InterpretCC: Intrinsic User-Centric Interpretability through Global Mixture of Experts [31.738009841932374]
Interpretability for neural networks is a trade-off between three key requirements.
We present InterpretCC, a family of interpretable-by-design neural networks that guarantee human-centric interpretability.
arXiv Detail & Related papers (2024-02-05T11:55:50Z)
- Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach [61.04606493712002]
Susceptibility to misinformation describes the degree of belief in unverifiable claims, and it is not directly observable.
Existing susceptibility studies rely heavily on self-reported beliefs.
We propose a computational approach to model users' latent susceptibility levels.
arXiv Detail & Related papers (2023-11-16T07:22:56Z)
- Inherent Inconsistencies of Feature Importance [6.02357145653815]
Feature importance assigns scores to individual features based on their contribution to prediction outcomes.
This paper presents an axiomatic framework designed to establish coherent relationships among the different contexts of feature importance scores.
arXiv Detail & Related papers (2022-06-16T14:21:51Z)
- Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should take into account several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
- Cross-model Fairness: Empirical Study of Fairness and Ethics Under Model Multiplicity [10.144058870887061]
We argue that individuals can be harmed when one predictor is chosen ad hoc from a group of equally well-performing models.
Our findings suggest that such unfairness can readily be found in real life and may be difficult to mitigate by technical means alone.
arXiv Detail & Related papers (2022-03-14T14:33:39Z)
- Empirical Estimates on Hand Manipulation are Recoverable: A Step Towards Individualized and Explainable Robotic Support in Everyday Activities [80.37857025201036]
A key challenge for robotic systems is to infer the behavior of another agent.
Drawing correct inferences is especially challenging when (confounding) factors are not controlled experimentally.
We propose equipping robots with the necessary tools to conduct observational studies on people.
arXiv Detail & Related papers (2022-01-27T22:15:56Z)
- A Unifying Bayesian Formulation of Measures of Interpretability in Human-AI Interaction [25.239891076153025]
We present a unifying Bayesian framework that models a human observer's evolving beliefs about an agent.
We show that definitions of interpretability measures such as explicability, legibility, and predictability from the prior literature fall out as special cases of our general framework.
arXiv Detail & Related papers (2021-04-21T20:06:33Z)
- Machine Common Sense [77.34726150561087]
Machine common sense remains a broad, potentially unbounded problem in artificial intelligence (AI).
This article deals with aspects of modeling commonsense reasoning, focusing on domains such as interpersonal interactions.
arXiv Detail & Related papers (2020-06-15T13:59:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.