Overinformative Question Answering by Humans and Machines
- URL: http://arxiv.org/abs/2305.07151v1
- Date: Thu, 11 May 2023 21:41:41 GMT
- Title: Overinformative Question Answering by Humans and Machines
- Authors: Polina Tsvilodub, Michael Franke, Robert D. Hawkins, Noah D. Goodman
- Abstract summary: We show that overinformativeness in human answering is driven by considerations of relevance to the questioner's goals.
We show that GPT-3 is highly sensitive to the form of the prompt and only achieves human-like answer patterns when guided by an example and cognitively-motivated explanation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When faced with a polar question, speakers often provide overinformative
answers going beyond a simple "yes" or "no". But what principles guide the
selection of additional information? In this paper, we provide experimental
evidence from two studies suggesting that overinformativeness in human
answering is driven by considerations of relevance to the questioner's goals
which they flexibly adjust given the functional context in which the question
is uttered. We take these human results as a strong benchmark for investigating
question-answering performance in state-of-the-art neural language models,
conducting an extensive evaluation on items from human experiments. We find
that most models fail to adjust their answering behavior in a human-like way
and tend to include irrelevant information. We show that GPT-3 is highly
sensitive to the form of the prompt and only achieves human-like answer
patterns when guided by an example and cognitively-motivated explanation.
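To make the prompting manipulation concrete, here is a minimal sketch of a one-shot prompt of the kind the abstract describes, pairing an example answer with a cognitively-motivated explanation. The context, question, and explanation text below are illustrative assumptions, not the paper's actual experimental items.

```python
# Minimal sketch of a one-shot prompt that pairs an example answer with a
# cognitively-motivated explanation, as the abstract describes. The stimuli
# below are hypothetical, not the paper's materials.
EXAMPLE = """\
Context: A customer calls a pet store that sells dogs and cats.
Question: Do you have birds?
Answer: No, but we have dogs and cats.
Explanation: The answerer infers the questioner's goal (buying a pet) and
mentions alternatives relevant to that goal instead of a bare "no"."""

def build_prompt(context: str, question: str) -> str:
    """Prepend the worked example and its explanation to a new test item."""
    return f"{EXAMPLE}\n\nContext: {context}\nQuestion: {question}\nAnswer:"

if __name__ == "__main__":
    print(build_prompt(
        "A traveler calls a cafe that serves tea and juice.",
        "Do you have coffee?",
    ))
```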
Related papers
- Analyzing Human Questioning Behavior and Causal Curiosity through Natural Queries [91.70689724416698]
We present NatQuest, a collection of 13,500 naturally occurring questions from three diverse sources.
Our analysis reveals a significant presence of causal questions (up to 42%) within the dataset.
arXiv Detail & Related papers (2024-05-30T17:55:28Z)
- Don't Just Say "I don't know"! Self-aligning Large Language Models for Responding to Unknown Questions with Explanations [70.6395572287422]
Our self-alignment method is capable of not only refusing to answer but also providing an explanation of why an unknown question is unanswerable.
We conduct disparity-driven self-curation to select qualified data for fine-tuning the LLM itself, aligning its responses to unknown questions as desired (a toy sketch of such curation follows this entry).
arXiv Detail & Related papers (2024-02-23T02:24:36Z)
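As an illustration only: one way to read "disparity-driven self-curation" is to sample several answers per question and treat high disagreement as a signal that the question is unknown to the model. The disparity measure, threshold, and refusal wording below are assumptions, not the paper's implementation.

```python
from collections import Counter

# Hypothetical sketch of disparity-driven self-curation: sample several
# answers per question; high disagreement marks the question as "unknown",
# and such questions get a refusal-with-explanation fine-tuning target.

def disparity(answers: list[str]) -> float:
    """Fraction of sampled answers that disagree with the majority answer."""
    majority_count = Counter(answers).most_common(1)[0][1]
    return 1.0 - majority_count / len(answers)

def curate(question: str, answers: list[str], threshold: float = 0.5) -> dict:
    """Build a fine-tuning pair: answer confidently or refuse with a reason."""
    if disparity(answers) > threshold:
        target = ("I don't know. My sampled answers disagree, which suggests "
                  "this question is outside what I can answer reliably.")
    else:
        target = Counter(answers).most_common(1)[0][0]
    return {"prompt": question, "completion": target}

# High disparity (four distinct answers) yields a refusal with explanation.
print(curate("Who will win the 2050 election?",
             ["Alice", "Bob", "Carol", "Dave"]))
```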
- What if you said that differently?: How Explanation Formats Affect Human Feedback Efficacy and User Perception [53.4840989321394]
We analyze the effect of rationales generated by QA models to support their answers.
We present users with incorrect answers and corresponding rationales in various formats.
We measure the effectiveness of this feedback in patching these rationales through in-context learning.
arXiv Detail & Related papers (2023-11-16T04:26:32Z)
- FOLLOWUPQG: Towards Information-Seeking Follow-up Question Generation [38.78216651059955]
We introduce the task of real-world information-seeking follow-up question generation (FQG).
We construct FOLLOWUPQG, a dataset of over 3K real-world (initial question, answer, follow-up question) tuples collected from a Reddit forum providing layman-friendly explanations for open-ended questions.
In contrast to existing datasets, questions in FOLLOWUPQG use more diverse pragmatic strategies to seek information, and they also show higher-order cognitive skills.
arXiv Detail & Related papers (2023-09-10T11:58:29Z)
- Connecting Humanities and Social Sciences: Applying Language and Speech Technology to Online Panel Surveys [2.0646127669654835]
We explore the application of language and speech technology to open-ended questions in a Dutch panel survey.
In an experimental wave respondents could choose to answer open questions via speech or keyboard.
We report the errors the ASR system produces and investigate the impact of these errors on downstream analyses (a word-error-rate sketch follows this entry).
arXiv Detail & Related papers (2023-02-21T10:52:15Z)
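Word error rate is the standard way to quantify the kind of ASR errors such an analysis reports. The following is a generic sketch of the metric, assuming simple whitespace tokenization; it is not the survey's own evaluation pipeline.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Levenshtein distance over words (substitutions, insertions,
    deletions), normalized by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[-1][-1] / max(len(ref), 1)

# Two substituted words out of five: WER = 0.4.
print(word_error_rate("ik antwoord graag via spraak",
                      "ik antwoorden gram via spraak"))
```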
- Single-Turn Debate Does Not Help Humans Answer Hard Reading-Comprehension Questions [29.932543276414602]
We build a dataset of single arguments for both a correct and incorrect answer option in a debate-style set-up.
We use long contexts: humans familiar with the context write convincing explanations for pre-selected correct and incorrect answers.
We test if those explanations allow humans who have not read the full context to more accurately determine the correct answer.
arXiv Detail & Related papers (2022-04-11T15:56:34Z)
- How Do We Answer Complex Questions: Discourse Structure of Long-form Answers [51.973363804064704]
We study the functional structure of long-form answers collected from three datasets.
Our main goal is to understand how humans organize information to craft complex answers.
Our work can inspire future research on discourse-level modeling and evaluation of long-form QA systems.
arXiv Detail & Related papers (2022-03-21T15:14:10Z)
- WebGPT: Browser-assisted question-answering with human feedback [12.865185980752733]
We fine-tune GPT-3 to answer long-form questions using a text-based web-browsing environment.
To make human evaluation of factual accuracy easier, models must collect references while browsing in support of their answers.
This model's answers are preferred by humans 56% of the time to those of our human demonstrators, and 69% of the time to the highest-voted answer from Reddit.
arXiv Detail & Related papers (2021-12-17T05:43:43Z)
- Open-domain clarification question generation without question examples [4.34222556313791]
We propose a framework for building a question-asking model capable of producing polar (yes-no) clarification questions.
Our model uses an expected information gain objective to derive informative questions from an off-the-shelf image captioner.
We demonstrate our model's ability to pose questions that improve communicative success in a goal-oriented 20 questions game with synthetic and human answerers (a sketch of the information-gain objective follows this entry).
arXiv Detail & Related papers (2021-10-19T07:51:54Z)
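For intuition, here is a minimal sketch of an expected-information-gain objective for a polar question over candidate referents. The uniform prior and noiseless yes/no answer model are simplifying assumptions; the paper derives questions from an image captioner, which this sketch does not cover.

```python
import math

# Toy expected information gain (EIG) for a polar question, assuming a
# uniform prior over referents and a noiseless yes/no answer model.

def entropy(probs: list[float]) -> float:
    return -sum(p * math.log2(p) for p in probs if p > 0)

def eig_polar(num_referents: int, num_yes: int) -> float:
    """EIG of a question that is true of `num_yes` of the referents."""
    n, k = num_referents, num_yes
    if k in (0, n):
        return 0.0  # the answer is certain, so nothing can be learned
    p_yes = k / n
    prior_h = entropy([1 / n] * n)
    # After "yes": uniform over k referents; after "no": uniform over n - k.
    posterior_h = (p_yes * entropy([1 / k] * k)
                   + (1 - p_yes) * entropy([1 / (n - k)] * (n - k)))
    return prior_h - posterior_h

# With 4 referents, a question true of exactly 2 of them is maximally
# informative: it yields one full bit.
print(eig_polar(4, 2))  # 1.0
print(eig_polar(4, 1))  # ~0.81
```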
- MixQG: Neural Question Generation with Mixed Answer Types [54.23205265351248]
We propose MixQG, a neural question generator that handles questions with diverse answer types.
We combine nine question answering datasets whose answer types include yes/no, multiple-choice, extractive, and abstractive answers (a data-mixing sketch follows this entry).
Our model outperforms existing work in both seen and unseen domains.
arXiv Detail & Related papers (2021-10-15T16:03:40Z)
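As an illustration only, a hypothetical sketch of flattening heterogeneous QA datasets into a single text-to-text question generation format, the kind of mixing the summary describes. The field names, separator format, and toy examples are assumptions, not MixQG's actual preprocessing.

```python
# Hypothetical sketch of mixing QA datasets with different answer types into
# one text-to-text question generation format: (answer, context) -> question.

def to_qg_example(context: str, answer: str, question: str) -> dict:
    """Serialize one QA example as a seq2seq pair for question generation."""
    return {"source": f"answer: {answer} context: {context}",
            "target": question}

# Toy stand-ins for yes/no, extractive, and abstractive source datasets.
datasets = {
    "boolq_style": [("The sky is blue.", "yes", "Is the sky blue?")],
    "extractive":  [("Paris is the capital of France.", "Paris",
                     "What is the capital of France?")],
    "abstractive": [("Plants use sunlight to make sugar.",
                     "they convert light into chemical energy",
                     "How do plants use sunlight?")],
}

mixed = [to_qg_example(c, a, q)
         for examples in datasets.values()
         for c, a, q in examples]
print(mixed[0])
```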
- Prompting Contrastive Explanations for Commonsense Reasoning Tasks [74.7346558082693]
Large pretrained language models (PLMs) can achieve near-human performance on commonsense reasoning tasks.
We show how to use these same models to generate human-interpretable evidence.
arXiv Detail & Related papers (2021-06-12T17:06:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.