On the Definition of Appropriate Trust and the Tools that Come with it
- URL: http://arxiv.org/abs/2309.11937v1
- Date: Thu, 21 Sep 2023 09:52:06 GMT
- Title: On the Definition of Appropriate Trust and the Tools that Come with it
- Authors: Helena Löfström
- Abstract summary: This paper starts with the definitions of appropriate trust from the literature.
It compares the definitions with model performance evaluation, showing the strong similarities between appropriate trust and model performance evaluation.
The paper offers several straightforward evaluation methods for different aspects of user performance, including a suggested method for measuring uncertainty and appropriate trust in regression.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Evaluating the efficiency of human-AI interactions is challenging, as it
involves both subjective and objective quality aspects. Because evaluations of
explanation methods focus on the human experience of the explanations, they have
become largely subjective, making comparative evaluations almost impossible and
highly dependent on the individual user. However, it is commonly agreed that one aspect
of explanation quality is how effectively the user can detect if the
predictions are trustworthy and correct, i.e., if the explanations can increase
the user's appropriate trust in the model. This paper starts with the
definitions of appropriate trust from the literature. It compares the
definitions with model performance evaluation, showing the strong similarities
between appropriate trust and model performance evaluation. The paper's main
contribution is a novel approach to evaluating appropriate trust by taking
advantage of the likenesses between definitions. The paper offers several
straightforward evaluation methods for different aspects of user performance,
including a suggested method for measuring uncertainty and appropriate trust in
regression.
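Under the definition used throughout the trust literature, trust is appropriate when the user relies on the model exactly when its predictions are correct; evaluating it then looks much like scoring a classifier, which is the likeness the paper exploits. The sketch below is a minimal illustration of that analogy, not the paper's implementation; the class and function names are assumptions chosen for readability.
```python
# Minimal sketch (illustrative, not the paper's method): treat the user's
# reliance decisions as "predictions" of model correctness and score them
# with the same machinery used for model performance evaluation.
from dataclasses import dataclass

@dataclass
class TrustEvaluation:
    appropriate_reliance: int   # user relied and the model was correct
    appropriate_rejection: int  # user overrode and the model was wrong
    over_trust: int             # user relied but the model was wrong
    under_trust: int            # user overrode but the model was correct

    @property
    def appropriate_trust_rate(self) -> float:
        total = (self.appropriate_reliance + self.appropriate_rejection
                 + self.over_trust + self.under_trust)
        return (self.appropriate_reliance + self.appropriate_rejection) / total

def evaluate_appropriate_trust(user_relied, model_correct) -> TrustEvaluation:
    """Confusion-matrix-style summary of user reliance vs. model correctness."""
    counts = dict(appropriate_reliance=0, appropriate_rejection=0,
                  over_trust=0, under_trust=0)
    for relied, correct in zip(user_relied, model_correct):
        if relied and correct:
            counts["appropriate_reliance"] += 1
        elif not relied and not correct:
            counts["appropriate_rejection"] += 1
        elif relied and not correct:
            counts["over_trust"] += 1
        else:
            counts["under_trust"] += 1
    return TrustEvaluation(**counts)

# Example: four reliance decisions by one user over four model predictions.
report = evaluate_appropriate_trust(
    user_relied=[True, True, False, False],
    model_correct=[True, False, False, True],
)
print(report.appropriate_trust_rate)  # 0.5
```
For regression, a comparable check could compare each reliance decision against whether the prediction error falls within an uncertainty-derived tolerance, in the spirit of the paper's suggestion to measure uncertainty and appropriate trust in regression.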
Related papers
- Automated Trustworthiness Testing for Machine Learning Classifiers [3.3423762257383207]
This paper proposes TOWER, the first technique to automatically create trustworthiness oracles that determine whether text classifier predictions are trustworthy.
Our hypothesis is that a prediction is trustworthy if the words in its explanation are semantically related to the predicted class.
The results show that TOWER can detect a decrease in trustworthiness as noise increases, but is not effective when evaluated against the human-labeled dataset.
arXiv Detail & Related papers (2024-06-07T20:25:05Z)
- ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models [53.00812898384698]
We argue that human evaluation of generative large language models (LLMs) should be a multidisciplinary undertaking.
We highlight how cognitive biases can lead evaluators to conflate fluency with truthfulness, and how cognitive uncertainty affects the reliability of rating scores such as Likert scales.
We propose the ConSiDERS-The-Human evaluation framework consisting of 6 pillars -- Consistency, Scoring Criteria, Differentiating, User Experience, Responsible, and Scalability.
arXiv Detail & Related papers (2024-05-28T22:45:28Z)
- Backdoor-based Explainable AI Benchmark for High Fidelity Evaluation of Attribution Methods [49.62131719441252]
Attribution methods compute importance scores for input features to explain the output predictions of deep models.
In this work, we first identify a set of fidelity criteria that reliable benchmarks for attribution methods are expected to fulfill.
We then introduce a Backdoor-based eXplainable AI benchmark (BackX) that adheres to the desired fidelity criteria.
arXiv Detail & Related papers (2024-05-02T13:48:37Z)
- FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets [69.91340332545094]
We introduce FLASK, a fine-grained evaluation protocol for both human-based and model-based evaluation.
We experimentally observe that fine-grained evaluation is crucial for attaining a holistic view of model performance.
arXiv Detail & Related papers (2023-07-20T14:56:35Z)
- Trust, but Verify: Using Self-Supervised Probing to Improve Trustworthiness [29.320691367586004]
We introduce a new approach of self-supervised probing, which enables us to check and mitigate the overconfidence issue for a trained model.
We provide a simple yet effective framework, which can be flexibly applied to existing trustworthiness-related methods in a plug-and-play manner.
arXiv Detail & Related papers (2023-02-06T08:57:20Z)
- Improving Model Understanding and Trust with Counterfactual Explanations of Model Confidence [4.385390451313721]
Showing confidence scores in human-agent interaction systems can help build trust between humans and AI systems.
Most existing research only used the confidence score as a form of communication.
This paper presents two methods for understanding model confidence using counterfactual explanation.
arXiv Detail & Related papers (2022-06-06T04:04:28Z)
- Personalized multi-faceted trust modeling to determine trust links in social media and its potential for misinformation management [61.88858330222619]
We present an approach for predicting trust links between peers in social media.
We propose a data-driven multi-faceted trust modeling which incorporates many distinct features for a comprehensive analysis.
Illustrated in a trust-aware item recommendation task, we evaluate the proposed framework in the context of a large Yelp dataset.
arXiv Detail & Related papers (2021-11-11T19:40:51Z)
- Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries [59.27273928454995]
Current pre-trained models applied to summarization are prone to factual inconsistencies which misrepresent the source text or introduce extraneous information.
We create a crowdsourcing evaluation framework for factual consistency using the rating-based Likert scale and ranking-based Best-Worst Scaling protocols.
We find that ranking-based protocols offer a more reliable measure of summary quality across datasets, while the reliability of Likert ratings depends on the target dataset and the evaluation design (a minimal Best-Worst scoring sketch follows this list).
arXiv Detail & Related papers (2021-09-19T19:05:00Z)
- On the Interaction of Belief Bias and Explanations [4.211128681972148]
We provide an overview of belief bias, its role in human evaluation, and ideas for NLP practitioners on how to account for it.
We show that conclusions about the highest performing methods change when introducing such controls, pointing to the importance of accounting for belief bias in evaluation.
arXiv Detail & Related papers (2021-06-29T12:49:42Z)
- Evaluations and Methods for Explanation through Robustness Analysis [117.7235152610957]
We establish a novel set of evaluation criteria for feature-based explanations via robustness analysis.
We obtain new explanations that are loosely necessary and sufficient for a prediction.
We extend the explanation to extract the set of features that would move the current prediction to a target class.
arXiv Detail & Related papers (2020-05-31T05:52:05Z)
- What's a Good Prediction? Challenges in evaluating an agent's knowledge [0.9281671380673306]
We show the conflict between accuracy and usefulness of general knowledge.
We propose an alternate evaluation approach that arises continually in the online continual learning setting.
This paper contributes a first look into evaluation of predictions through their use.
arXiv Detail & Related papers (2020-01-23T21:44:43Z)
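The crowdsourcing entry above contrasts Likert ratings with Best-Worst Scaling. A standard way to turn Best-Worst annotations into per-item scores is the counting estimator (#times chosen best − #times chosen worst) / #times shown. The sketch below is a minimal illustration of that estimator under assumed input shapes, not code from the cited paper.
```python
# Minimal sketch of count-based Best-Worst Scaling aggregation (illustrative,
# not from the cited paper): each annotation names the best and worst item in
# a small tuple of candidate summaries shown to an annotator.
from collections import Counter

def bws_scores(annotations):
    """annotations: iterable of (items_shown, best_item, worst_item) tuples.
    Returns {item: (#best - #worst) / #times shown}, a score in [-1, 1]."""
    shown, best, worst = Counter(), Counter(), Counter()
    for items, b, w in annotations:
        shown.update(items)
        best[b] += 1
        worst[w] += 1
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}

# Example: three annotations over candidate summaries A, B, C, D.
annotations = [
    (("A", "B", "C"), "A", "C"),
    (("A", "B", "D"), "A", "D"),
    (("B", "C", "D"), "B", "D"),
]
print(bws_scores(annotations))
# {'A': 1.0, 'B': 0.33..., 'C': -0.5, 'D': -1.0}
```
Because the scores are bounded in [-1, 1], they are directly comparable across item sets, which is one reason ranking-based protocols tend to be more stable than raw Likert ratings.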