Why Would You Suggest That? Human Trust in Language Model Responses
- URL: http://arxiv.org/abs/2406.02018v2
- Date: Fri, 04 Oct 2024 16:46:00 GMT
- Title: Why Would You Suggest That? Human Trust in Language Model Responses
- Authors: Manasi Sharma, Ho Chit Siu, Rohan Paleja, Jaime D. Peña
- Abstract summary: We analyze how the framing and presence of explanations affect user trust and model performance.
Our findings urge future research to delve deeper into the nuanced evaluation of trust in human-machine teaming systems.
- Score: 0.3749861135832073
- Abstract: The emergence of Large Language Models (LLMs) has revealed a growing need for human-AI collaboration, especially in creative decision-making scenarios where trust and reliance are paramount. Through human studies and model evaluations on the open-ended News Headline Generation task from the LaMP benchmark, we analyze how the framing and presence of explanations affect user trust and model performance. Overall, we provide evidence that adding an explanation in the model response to justify its reasoning significantly increases self-reported user trust in the model when the user has the opportunity to compare various responses. Position and faithfulness of these explanations are also important factors. However, these gains disappear when users are shown responses independently, suggesting that humans trust all model responses, including deceptive ones, equitably when they are shown in isolation. Our findings urge future research to delve deeper into the nuanced evaluation of trust in human-machine teaming systems.
Related papers
- Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models [14.5291643644017]
We introduce the concept of Confidence-Probability Alignment.
We probe the alignment between models' internal and expressed confidence.
Among the models analyzed, OpenAI's GPT-4 showed the strongest confidence-probability alignment.
arXiv Detail & Related papers (2024-05-25T15:42:04Z)
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution regarding the usefulness and potential for misunderstanding in saliency-based explanations.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- RELIC: Investigating Large Language Model Responses using Self-Consistency [58.63436505595177]
Large Language Models (LLMs) are notorious for blending fact with fiction and generating non-factual content, known as hallucinations.
We propose an interactive system that helps users gain insight into the reliability of the generated text.
arXiv Detail & Related papers (2023-11-28T14:55:52Z)
- What if you said that differently?: How Explanation Formats Affect Human Feedback Efficacy and User Perception [53.4840989321394]
We analyze the effect of rationales generated by QA models to support their answers.
We present users with incorrect answers and corresponding rationales in various formats.
We measure the effectiveness of this feedback in patching these rationales through in-context learning.
arXiv Detail & Related papers (2023-11-16T04:26:32Z)
- When Large Language Models contradict humans? Large Language Models' Sycophantic Behaviour [0.8133739801185272]
We show that Large Language Models (LLMs) show sycophantic tendencies when responding to queries involving subjective opinions and statements.
LLMs at various scales seem not to follow users' hints, instead demonstrating confidence in delivering the correct answers.
arXiv Detail & Related papers (2023-11-15T22:18:33Z)
- Measuring the Effect of Influential Messages on Varying Personas [67.1149173905004]
We present a new task, Response Forecasting on Personas for News Media, to estimate the response a persona might have upon seeing a news message.
The proposed task not only introduces personalization in the modeling but also predicts the sentiment polarity and intensity of each response.
This enables more accurate and comprehensive inference on the mental state of the persona.
arXiv Detail & Related papers (2023-05-25T21:01:00Z)
- Why not both? Complementing explanations with uncertainty, and the role of self-confidence in Human-AI collaboration [12.47276164048813]
We conduct an empirical study to identify how uncertainty estimates and model explanations affect users' reliance, understanding, and trust towards a model.
We also discuss how the latter may distort the outcome of an analysis based on agreement and switching percentages.
arXiv Detail & Related papers (2023-04-27T12:24:33Z)
- Explaining Model Confidence Using Counterfactuals [4.385390451313721]
Displaying confidence scores in human-AI interaction has been shown to help build trust between humans and AI systems.
Most existing research uses only the confidence score as a form of communication.
We show that counterfactual explanations of confidence scores help study participants to better understand and better trust a machine learning model's prediction.
arXiv Detail & Related papers (2023-03-10T06:22:13Z)
- Improving Model Understanding and Trust with Counterfactual Explanations of Model Confidence [4.385390451313721]
Showing confidence scores in human-agent interaction systems can help build trust between humans and AI systems.
Most existing research has used only the confidence score as a form of communication.
This paper presents two methods for understanding model confidence using counterfactual explanations.
arXiv Detail & Related papers (2022-06-06T04:04:28Z)
- Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z)
- Improving Factual Consistency Between a Response and Persona Facts [64.30785349238619]
Neural models for response generation produce responses that are semantically plausible but not necessarily factually consistent with facts describing the speaker's persona.
We propose to fine-tune these models by reinforcement learning and an efficient reward function that explicitly captures the consistency between a response and persona facts as well as semantic plausibility.
arXiv Detail & Related papers (2020-04-30T18:08:22Z)
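The last entry above fine-tunes a response generator with a reward that combines persona-fact consistency and semantic plausibility. As an illustrative sketch only (not the paper's actual reward function), such a scalar reward can be formed as a weighted combination of the two scores; the scorers below are toy stand-ins for the learned models the paper would use, and the names and weights are assumptions for illustration.

```python
# Illustrative sketch of a combined RL reward: weighted sum of a
# persona-consistency score and a plausibility score. Both scorers
# here are toy stand-ins, NOT the paper's trained models.

def consistency_score(response: str, persona_facts: list[str]) -> float:
    """Toy consistency scorer: fraction of persona facts with at
    least one content word (length > 3) appearing in the response."""
    if not persona_facts:
        return 1.0
    hits = sum(
        1 for fact in persona_facts
        if any(word.lower() in response.lower()
               for word in fact.split() if len(word) > 3)
    )
    return hits / len(persona_facts)

def plausibility_score(response: str) -> float:
    """Toy plausibility scorer standing in for a language-model
    likelihood: rewards responses of moderate length."""
    n = len(response.split())
    base = max(0.0, min(1.0, n / 10.0))  # too-short responses score low
    return base * (1.0 if n <= 40 else 0.5)  # penalize rambling

def reward(response: str, persona_facts: list[str],
           alpha: float = 0.7) -> float:
    """Convex combination of the two scores, usable as a scalar
    reward signal during RL fine-tuning (alpha is a tunable weight)."""
    return (alpha * consistency_score(response, persona_facts)
            + (1 - alpha) * plausibility_score(response))
```

A response that restates a persona fact at reasonable length scores higher than an off-persona one, which is the gradient signal the fine-tuning relies on.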
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.