Fool Me, Fool Me: User Attitudes Toward LLM Falsehoods
- URL: http://arxiv.org/abs/2412.11625v1
- Date: Mon, 16 Dec 2024 10:10:27 GMT
- Title: Fool Me, Fool Me: User Attitudes Toward LLM Falsehoods
- Authors: Diana Bar-Or Nirman, Ariel Weizman, Amos Azaria
- Abstract summary: This study examines user preferences regarding falsehood responses from Large Language Models (LLMs). Surprisingly, 61% of users prefer unmarked falsehood responses over marked ones. These findings suggest that user preferences, which influence LLM training via feedback mechanisms, may inadvertently encourage the generation of falsehoods.
- Score: 13.62116438805314
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: While Large Language Models (LLMs) have become central tools in various fields, they often provide inaccurate or false information. This study examines user preferences regarding falsehood responses from LLMs. Specifically, we evaluate preferences for LLM responses where false statements are explicitly marked versus unmarked responses and preferences for confident falsehoods compared to LLM disclaimers acknowledging a lack of knowledge. Additionally, we investigate how requiring users to assess the truthfulness of statements influences these preferences. Surprisingly, 61% of users prefer unmarked falsehood responses over marked ones, and 69% prefer confident falsehoods over LLMs admitting lack of knowledge. In all our experiments, a total of 300 users participated, contributing valuable data to our analysis and conclusions. When users are required to evaluate the truthfulness of statements, preferences for unmarked responses and for confident falsehoods decrease slightly but remain high. These findings suggest that user preferences, which influence LLM training via feedback mechanisms, may inadvertently encourage the generation of falsehoods. Future research should address the ethical and practical implications of aligning LLM behavior with such preferences.
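The headline numbers above are preference shares, so a quick way to gauge how far they sit from a 50/50 split is a binomial test. The sketch below is not the authors' analysis; the per-condition sample size is a placeholder, since the abstract only reports 300 participants in total across all experiments.

```python
# A minimal sketch (not the authors' analysis code): test whether an observed preference
# share, e.g. 61% preferring unmarked falsehoods, differs from a 50/50 chance split.
# The per-condition sample size is a placeholder, not a figure taken from the paper.
from scipy.stats import binomtest

n_participants = 100       # hypothetical number of users in one pairwise comparison
n_prefer_unmarked = 61     # hypothetical count matching the reported 61% share

result = binomtest(n_prefer_unmarked, n_participants, p=0.5, alternative="two-sided")
print(f"Preference share: {n_prefer_unmarked / n_participants:.0%}")
print(f"Two-sided p-value against a 50/50 null: {result.pvalue:.4f}")
```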
Related papers
- LLMs Struggle to Reject False Presuppositions when Misinformation Stakes are High [7.9042053398943075]
Presuppositions subtly introduce information as given, making them highly effective at embedding disputable or false information.
This raises concerns about whether LLMs, like humans, may fail to detect and correct misleading assumptions introduced as false presuppositions.
arXiv Detail & Related papers (2025-05-28T13:35:07Z)
- Bayesian Teaching Enables Probabilistic Reasoning in Large Language Models [50.16340812031201]
We show that large language models (LLMs) do not update their beliefs as expected from the Bayesian framework.
We teach the LLMs to reason in a Bayesian manner by training them to mimic the predictions of an optimal Bayesian model.
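As a rough illustration of what an "optimal Bayesian model" training target could look like (an assumption for illustration; the paper's actual tasks and teacher model may differ), the sketch below performs a Beta-Binomial belief update whose predictions an LLM could be trained to mimic.

```python
# A minimal sketch, assuming a Beta-Binomial setting: the Bayesian-optimal posterior that
# an LLM could be trained to mimic. Illustrative only; not the paper's actual setup.

def beta_binomial_update(alpha: float, beta: float, successes: int, failures: int):
    """Return the posterior Beta(alpha', beta') after observing new outcomes."""
    return alpha + successes, beta + failures

# Start from a uniform prior Beta(1, 1), then observe 7 successes and 3 failures.
alpha, beta = beta_binomial_update(1.0, 1.0, successes=7, failures=3)
posterior_mean = alpha / (alpha + beta)   # 8 / 12 ≈ 0.667
print(f"Bayesian-optimal posterior mean: {posterior_mean:.3f}")
# Training pairs for the LLM could map each observation sequence to this posterior
# prediction, so the model learns to update beliefs the way the optimal model does.
```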
arXiv Detail & Related papers (2025-03-21T20:13:04Z)
- How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation [24.355564722047244]
Large Language Models (LLMs) are widely deployed in diverse scenarios.
The extent to which they could tacitly spread misinformation emerges as a critical safety concern.
We curated ECHOMIST, the first benchmark for implicit misinformation.
arXiv Detail & Related papers (2025-03-12T17:59:18Z)
- Balancing Truthfulness and Informativeness with Uncertainty-Aware Instruction Fine-Tuning [79.48839334040197]
Instruction fine-tuning (IFT) can increase the informativeness of large language models (LLMs), but may reduce their truthfulness.
In this paper, we empirically demonstrate how unfamiliar knowledge in IFT datasets can negatively affect the truthfulness of LLMs.
We introduce two new IFT paradigms, UNIT_cut and UNIT_ref, to address this issue.
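As a purely hypothetical sketch of what uncertainty-aware filtering of an IFT dataset could look like (the variable names below only echo the paradigms mentioned above; the uncertainty score, threshold, and hedged-answer rewrite are assumptions, not the paper's procedure):

```python
# Hypothetical sketch of uncertainty-aware IFT data handling; not the paper's UNIT method.
# `uncertainty` stands in for some measure of how unfamiliar the answer is to the base model.

ift_samples = [
    {"instruction": "Who wrote Hamlet?", "answer": "William Shakespeare", "uncertainty": 0.05},
    {"instruction": "What did the minister say in 1973?", "answer": "Unclear", "uncertainty": 0.92},
]

THRESHOLD = 0.5  # illustrative cutoff, not a value from the paper

# Cut-style handling: drop samples the base model is unfamiliar with (high uncertainty).
kept = [s for s in ift_samples if s["uncertainty"] <= THRESHOLD]

# Reflect-style handling (also hypothetical): keep such samples but hedge their answers
# so fine-tuning does not teach the model to assert unfamiliar facts confidently.
hedged = [
    s if s["uncertainty"] <= THRESHOLD
    else {**s, "answer": "I am not sure, but possibly: " + s["answer"]}
    for s in ift_samples
]

print(len(kept), "kept after cutting;", len(hedged), "total after hedging")
```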
arXiv Detail & Related papers (2025-02-17T16:10:30Z)
- Fostering Appropriate Reliance on Large Language Models: The Role of Explanations, Sources, and Inconsistencies [66.30619782227173]
Large language models (LLMs) can produce erroneous responses that sound fluent and convincing.
We identify several features of LLM responses that shape users' reliance.
We find that explanations increase reliance on both correct and incorrect responses.
We observe less reliance on incorrect responses when sources are provided or when explanations exhibit inconsistencies.
arXiv Detail & Related papers (2025-02-12T16:35:41Z)
- Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions [45.04582353648683]
We propose to assign preference labels by simulating expected outcomes in future turns.
This allows LLMs to learn to ask clarifying questions when doing so enables them to generate responses tailored to each user's interpretation in future turns.
We evaluate systems based on their ability to ask clarifying questions that can recover each user's interpretation and expected answer.
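A schematic sketch of the idea, with toy stand-ins for the rollout and scoring components (illustrative control flow only; the paper's simulation and evaluation setup may differ): candidates whose simulated future turns serve every plausible user interpretation earn the higher preference label.

```python
# Schematic sketch: rank candidate first responses by how well their simulated future
# turns serve each plausible user interpretation. Toy stand-ins, not the paper's setup.

def rank_by_future_turns(candidates, interpretations, simulate_turn, score_outcome):
    """Rank candidates by the average score of their simulated follow-up outcomes."""
    def expected_score(candidate):
        outcomes = [simulate_turn(candidate, interp) for interp in interpretations]
        return sum(score_outcome(o, i) for o, i in zip(outcomes, interpretations)) / len(outcomes)
    return sorted(candidates, key=expected_score, reverse=True)

ranked = rank_by_future_turns(
    candidates=["Do you mean option X or option Y?", "The answer is X."],
    interpretations=["the user meant X", "the user meant Y"],
    # In practice these would be an LLM rollout and an answer judge; toy lambdas here.
    simulate_turn=lambda cand, interp: (cand, interp),
    score_outcome=lambda outcome, interp: 1.0 if "or" in outcome[0]
    else (1.0 if interp.endswith("X") else 0.0),
)
print(ranked[0])  # the clarifying question wins: the direct answer fits only one interpretation
```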
arXiv Detail & Related papers (2024-10-17T17:29:04Z)
- CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models [60.59638232596912]
We introduce CLAMBER, a benchmark for evaluating how well large language models (LLMs) identify and clarify ambiguous information needs.
Building upon the proposed taxonomy, we construct 12K high-quality data samples to assess the strengths, weaknesses, and potential risks of various off-the-shelf LLMs.
Our findings indicate the limited practical utility of current LLMs in identifying and clarifying ambiguous user queries.
arXiv Detail & Related papers (2024-05-20T14:34:01Z)
- "I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust [51.542856739181474]
We show how different natural language expressions of uncertainty impact participants' reliance, trust, and overall task performance.
We find that first-person expressions of uncertainty decrease participants' confidence in the system and their tendency to agree with the system's answers, while increasing participants' accuracy.
Our findings suggest that using natural language expressions of uncertainty may be an effective approach for reducing overreliance on LLMs, but that the precise language used matters.
arXiv Detail & Related papers (2024-05-01T16:43:55Z)
- Truth-Aware Context Selection: Mitigating Hallucinations of Large Language Models Being Misled by Untruthful Contexts [31.769428095250912]
Large Language Models (LLMs) are easily misled by untruthful contexts provided by users or knowledge augmentation tools.
We propose Truth-Aware Context Selection (TACS) to adaptively recognize and mask untruthful context from the inputs.
We show that TACS can effectively filter untruthful context and significantly improve the overall quality of LLMs' responses when presented with misleading information.
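A minimal sketch of the selection idea described above, assuming a sentence-level truthfulness scorer (the scorer below is a stand-in, and this is not the authors' implementation):

```python
# A minimal sketch of truth-aware context selection (not the authors' implementation):
# score each context sentence for truthfulness and mask low-scoring ones before the LLM
# sees them. `truthfulness_score` is a stand-in for a trained truthfulness detector.

def truthfulness_score(sentence: str) -> float:
    """Placeholder scorer; a real system would use a learned detector."""
    return 0.0 if "flat" in sentence.lower() else 0.9

def select_truthful_context(context: str, threshold: float = 0.5) -> str:
    sentences = [s.strip() for s in context.split(".") if s.strip()]
    kept = [s for s in sentences if truthfulness_score(s) >= threshold]
    return ". ".join(kept) + ("." if kept else "")

context = "The Earth is flat. The Earth orbits the Sun."
# The untruthful sentence is filtered out; prints: "The Earth orbits the Sun."
print(select_truthful_context(context))
```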
arXiv Detail & Related papers (2024-03-12T11:40:44Z)
- Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models [84.94220787791389]
We propose Fact-and-Reflection (FaR) prompting, which improves the LLM calibration in two steps.
Experiments show that FaR achieves significantly better calibration; it lowers the Expected Error by 23.5%.
FaR even elicits the capability of verbally expressing concerns in less confident scenarios.
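A minimal sketch of the two-step prompting pattern described above (the prompt wording is illustrative, not the paper's exact prompts; `generate` is whatever text-completion call you supply):

```python
# A minimal sketch of Fact-and-Reflection style two-step prompting. Prompt wording is
# illustrative; `generate` is any text-completion function the caller supplies.

def fact_and_reflection(question: str, generate) -> str:
    # Step 1: elicit the facts the model knows that bear on the question.
    facts = generate(f"List the known facts relevant to answering: {question}")
    # Step 2: reflect on those facts to produce an answer with a stated confidence.
    return generate(
        f"Question: {question}\n"
        f"Relevant facts:\n{facts}\n"
        "Reflecting on these facts, give your answer and state how confident you are."
    )

# Dummy generator just to show the control flow; swap in a real LLM call.
print(fact_and_reflection("Who discovered penicillin?",
                          lambda prompt: f"[model output for: {prompt[:40]}...]"))
```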
arXiv Detail & Related papers (2024-02-27T01:37:23Z)
- Exploring Value Biases: How LLMs Deviate Towards the Ideal [57.99044181599786]
Large Language Models (LLMs) are deployed in a wide range of applications, and their responses have an increasing social impact.
We show that value bias is strong in LLMs across different categories, similar to the results found in human studies.
arXiv Detail & Related papers (2024-02-16T18:28:43Z)
- Large Language Models Help Humans Verify Truthfulness -- Except When They Are Convincingly Wrong [35.64962031447787]
Large Language Models (LLMs) are increasingly used for accessing information on the web.
Our experiments with 80 crowdworkers compare language models with search engines (information retrieval systems) at facilitating fact-checking.
Users reading LLM explanations are significantly more efficient than those using search engines while achieving similar accuracy.
arXiv Detail & Related papers (2023-10-19T08:09:58Z)
- Intuitive or Dependent? Investigating LLMs' Behavior Style to Conflicting Prompts [9.399159332152013]
This study investigates the behaviors of Large Language Models (LLMs) when faced with prompts that conflict with their internal memory.
This helps in understanding LLMs' decision-making mechanisms and also benefits real-world applications such as retrieval-augmented generation (RAG).
arXiv Detail & Related papers (2023-09-29T17:26:03Z)
- Statistical Knowledge Assessment for Large Language Models [79.07989821512128]
Given varying prompts regarding a factoid question, can a large language model (LLM) reliably generate factually correct answers?
We propose KaRR, a statistical approach to assess factual knowledge for LLMs.
Our results reveal that the knowledge in LLMs with the same backbone architecture adheres to the scaling law, while tuning on instruction-following data sometimes compromises the model's capability to generate factually correct text reliably.
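As a simplified illustration of assessing knowledge across prompt variations (this is not the KaRR statistic itself, which is defined over model output probabilities; the helper below is a hypothetical accuracy-based proxy):

```python
# Simplified sketch: measure how reliably a model answers the same factoid across
# prompt variations. This is an accuracy-based proxy, not the KaRR statistic itself.

def knowledge_reliability(prompts, gold_answer: str, generate) -> float:
    """Fraction of prompt variants whose output contains the gold answer."""
    hits = sum(gold_answer.lower() in generate(p).lower() for p in prompts)
    return hits / len(prompts)

prompts = [
    "What is the capital of France?",
    "Name the capital city of France.",
    "France's capital is which city?",
]
# `generate` is any LLM call; a stub is used here so the sketch runs as-is.
score = knowledge_reliability(prompts, "Paris", lambda p: "Paris")
print(f"Reliability across prompt variants: {score:.2f}")
```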
arXiv Detail & Related papers (2023-05-17T18:54:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.