Can Users Detect Biases or Factual Errors in Generated Responses in Conversational Information-Seeking?
- URL: http://arxiv.org/abs/2410.21529v1
- Date: Mon, 28 Oct 2024 20:55:00 GMT
- Title: Can Users Detect Biases or Factual Errors in Generated Responses in Conversational Information-Seeking?
- Authors: Weronika Łajewska, Krisztian Balog, Damiano Spina, Johanne Trippas
- Abstract summary: We investigate the limitations of response generation in conversational information-seeking systems.
The study addresses the problem of query answerability and the challenge of response incompleteness.
Our analysis reveals that it is easier for users to detect response incompleteness than query answerability.
- Score: 13.790574266700006
- Abstract: Information-seeking dialogues span a wide range of questions, from simple factoid to complex queries that require exploring multiple facets and viewpoints. When performing exploratory searches in unfamiliar domains, users may lack background knowledge and struggle to verify the system-provided information, making them vulnerable to misinformation. We investigate the limitations of response generation in conversational information-seeking systems, highlighting potential inaccuracies, pitfalls, and biases in the responses. The study addresses the problem of query answerability and the challenge of response incompleteness. Our user studies explore how these issues impact user experience, focusing on users' ability to identify biased, incorrect, or incomplete responses. We design two crowdsourcing tasks to assess user experience with different system response variants, highlighting critical issues to be addressed in future conversational information-seeking research. Our analysis reveals that it is easier for users to detect response incompleteness than query answerability, and that user satisfaction is mostly associated with response diversity, not factual correctness.
Related papers
- Open Domain Question Answering with Conflicting Contexts [55.739842087655774]
We find that as much as 25% of unambiguous, open domain questions can lead to conflicting contexts when retrieved using Google Search.
We ask our annotators to provide explanations for their selections of correct answers.
arXiv Detail & Related papers (2024-10-16T07:24:28Z)
- Grounded and Transparent Response Generation for Conversational Information-Seeking Systems [0.0]
The proposed research delves into the intricacies of response generation in conversational information-seeking (CIS) systems.
The study focuses on generating responses grounded in the retrieved passages and being transparent about the system's limitations.
arXiv Detail & Related papers (2024-06-27T15:55:25Z)
- Explainability for Transparent Conversational Information-Seeking [13.790574266700006]
This study explores different methods of explaining the responses.
By exploring transparency across explanation type, quality, and presentation mode, this research aims to bridge the gap between system-generated responses and responses verifiable by the user.
arXiv Detail & Related papers (2024-05-06T09:25:14Z)
- CLARINET: Augmenting Language Models to Ask Clarification Questions for Retrieval [52.134133938779776]
We present CLARINET, a system that asks informative clarification questions by choosing questions whose answers would maximize certainty in the correct candidate.
Our approach works by augmenting a large language model (LLM) to condition on a retrieval distribution, finetuning end-to-end to generate the question that would have maximized the rank of the true candidate at each turn.
arXiv Detail & Related papers (2024-04-28T18:21:31Z)
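The CLARINET entry above describes choosing the question whose answers would maximize certainty in the correct candidate. A minimal sketch of that selection criterion, assuming a toy discrete answer model rather than the paper's finetuned LLM, might look like this:

```python
# Sketch of expected-certainty question selection. The probability model and
# all values below are illustrative assumptions, not CLARINET's implementation.

def posterior(prior, likelihoods):
    """Bayes update of the candidate distribution given P(answer | candidate)."""
    unnorm = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(unnorm)
    return [u / z for u in unnorm] if z > 0 else prior

def expected_certainty(prior, question):
    """Expected max posterior probability over candidates after asking `question`."""
    total = 0.0
    for likelihoods in question["answers"].values():
        p_answer = sum(p * l for p, l in zip(prior, likelihoods))
        if p_answer > 0:
            total += p_answer * max(posterior(prior, likelihoods))
    return total

def best_question(prior, questions):
    """Pick the clarifying question with the highest expected certainty."""
    return max(questions, key=lambda q: expected_certainty(prior, q))

# Toy example: three retrieval candidates and two candidate questions, where
# "answers" maps each possible user reply to per-candidate likelihoods.
prior = [0.5, 0.3, 0.2]
questions = [
    {"text": "Do you mean the programming language?",
     "answers": {"yes": [0.9, 0.1, 0.2], "no": [0.1, 0.9, 0.8]}},
    {"text": "Is this about installation?",  # uninformative: identical likelihoods
     "answers": {"yes": [0.5, 0.5, 0.5], "no": [0.5, 0.5, 0.5]}},
]
print(best_question(prior, questions)["text"])  # prefers the informative question
```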
- PAQA: Toward ProActive Open-Retrieval Question Answering [34.883834970415734]
This work aims to tackle the challenge of generating relevant clarifying questions by taking into account the inherent ambiguities present in both user queries and documents.
We propose PAQA, an extension to the existing AmbiNQ dataset, incorporating clarifying questions.
We then evaluate various models and assess how passage retrieval impacts ambiguity detection and the generation of clarifying questions.
arXiv Detail & Related papers (2024-02-26T14:40:34Z)
- Asking Multimodal Clarifying Questions in Mixed-Initiative Conversational Search [89.1772985740272]
In mixed-initiative conversational search systems, clarifying questions are used to help users who struggle to express their intentions in a single query.
We hypothesize that in scenarios where multimodal information is pertinent, the clarification process can be improved by using non-textual information.
We collect a dataset named Melon that contains over 4k multimodal clarifying questions, enriched with over 14k images.
Several analyses are conducted to understand the importance of multimodal contents during the query clarification phase.
arXiv Detail & Related papers (2024-02-12T16:04:01Z)
- Social Commonsense-Guided Search Query Generation for Open-Domain Knowledge-Powered Conversations [66.16863141262506]
We present a novel approach that focuses on generating internet search queries guided by social commonsense.
Our proposed framework addresses passive user interactions by integrating topic tracking, commonsense response generation and instruction-driven query generation.
arXiv Detail & Related papers (2023-10-22T16:14:56Z)
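The query-generation framework named in the entry above can be sketched as a chain of three prompted stages. The `llm` callable and all prompt wording here are hypothetical stand-ins, not the paper's actual design:

```python
# Sketch of the three-stage pipeline: topic tracking, commonsense inference,
# instruction-driven query generation. `llm` is a hypothetical prompt ->
# completion callable; the prompt wording is an illustrative assumption.
from typing import Callable, List

def generate_search_query(history: List[str], llm: Callable[[str], str]) -> str:
    dialogue = "\n".join(history)
    # Stage 1: topic tracking over the dialogue so far.
    topic = llm(f"Identify the current topic of this dialogue:\n{dialogue}\nTopic:")
    # Stage 2: a social-commonsense inference about the user's likely need.
    inference = llm("What would a person in this situation likely want to "
                    f"know about '{topic.strip()}'? Answer in one sentence.")
    # Stage 3: instruction-driven query generation from topic + inference.
    query = llm("Instruction: write one concise web search query.\n"
                f"Topic: {topic.strip()}\nUser need: {inference.strip()}\nQuery:")
    return query.strip()

# Usage with any LLM wrapper exposing a prompt -> completion interface:
# q = generate_search_query(["I just adopted a puppy!", "It chews everything."], my_llm)
```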
- ExpertQA: Expert-Curated Questions and Attributed Answers [51.68314045809179]
We collect expert-curated questions from 484 participants across 32 fields of study, and then ask the same experts to evaluate generated responses to their own questions.
We conduct human evaluation of responses from a few representative systems along various axes of attribution and factuality.
The output of our analysis is ExpertQA, a high-quality long-form QA dataset with 2177 questions spanning 32 fields, along with verified answers and attributions for claims in the answers.
arXiv Detail & Related papers (2023-09-14T16:54:34Z)
- Continually Improving Extractive QA via Human Feedback [59.49549491725224]
We study continually improving an extractive question answering (QA) system via human user feedback.
We conduct experiments involving thousands of user interactions under diverse setups to broaden the understanding of learning from feedback over time.
arXiv Detail & Related papers (2023-05-21T14:35:32Z)
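The feedback-driven improvement loop described in the entry above can be sketched as follows; the reward encoding and the `model.step` training hook are hypothetical stand-ins, not the paper's actual setup:

```python
# Sketch of continual improvement from user feedback. The reward encoding and
# the `model.step` training hook are hypothetical, not the paper's method.
from dataclasses import dataclass, field
from typing import List

@dataclass
class FeedbackExample:
    question: str
    context: str
    predicted_span: str
    reward: int  # +1 if the user marked the answer helpful, -1 otherwise

@dataclass
class FeedbackLoop:
    model: object                 # any extractive QA model with a training hook
    batch_size: int = 256
    log: List[FeedbackExample] = field(default_factory=list)

    def record(self, question, context, span, thumbs_up):
        """Store one deployed prediction together with its user feedback."""
        self.log.append(FeedbackExample(question, context, span,
                                        1 if thumbs_up else -1))
        if len(self.log) >= self.batch_size:
            self.update()

    def update(self):
        """Reward-weighted update: raise the likelihood of spans users liked,
        lower it for spans they rejected, then start a fresh batch."""
        for ex in self.log:
            # `step` is a hypothetical hook, e.g. one reward-weighted
            # log-likelihood gradient step on the predicted span.
            self.model.step(ex.question, ex.context,
                            ex.predicted_span, weight=ex.reward)
        self.log.clear()
```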
- Multi-stage Clarification in Conversational AI: The case of Question-Answering Dialogue Systems [0.27998963147546135]
Clarification resolution plays an important role in various information retrieval tasks such as interactive question answering and conversational search.
We propose a multi-stage clarification mechanism for prompting clarification and query selection in the context of a question answering dialogue system.
Our proposed mechanism improves the overall user experience and outperforms competitive baselines on two datasets.
arXiv Detail & Related papers (2021-10-28T15:45:44Z)
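A multi-stage mechanism like the one described in the entry above can be sketched as a detect, ask, select loop. The ambiguity threshold, candidate interpretations, and I/O wiring below are illustrative assumptions:

```python
# Sketch of a detect -> ask -> select clarification flow. The threshold and
# all stand-in functions are illustrative assumptions, not the paper's design.
from typing import Callable, List

def clarify(query: str,
            candidates: List[str],
            ambiguity_score: Callable[[str], float],
            ask_user: Callable[[str], str]) -> str:
    # Stage 1: ambiguity detection. Use the query as-is if it is clear enough.
    if ambiguity_score(query) <= 0.5:
        return query
    # Stage 2: prompt the user with the candidate interpretations.
    options = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(candidates))
    reply = ask_user(f"Your question is ambiguous. Which did you mean?\n{options}\n> ")
    # Stage 3: query selection based on the user's reply (a numeric choice here).
    try:
        return candidates[int(reply) - 1]
    except (ValueError, IndexError):
        return query  # fall back to the original query on unusable input

# Example wiring with trivial stand-ins:
# refined = clarify("jaguar speed",
#                   ["top speed of the jaguar (animal)", "top speed of a Jaguar car"],
#                   lambda q: 0.9, input)
```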
- An Empirical Study of Clarifying Question-Based Systems [15.767515065224016]
We conduct an online experiment by deploying an experimental system, which interacts with users by asking clarifying questions against a product repository.
We collect both implicit interaction behavior data and explicit feedback from users, showing that users are willing to answer a good number of clarifying questions (11 to 21 on average), but not many more than that.
arXiv Detail & Related papers (2020-08-01T15:10:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.