Selectively Answering Ambiguous Questions
- URL: http://arxiv.org/abs/2305.14613v2
- Date: Wed, 15 Nov 2023 02:15:02 GMT
- Title: Selectively Answering Ambiguous Questions
- Authors: Jeremy R. Cole, Michael J.Q. Zhang, Daniel Gillick, Julian Martin
Eisenschlos, Bhuwan Dhingra, and Jacob Eisenstein
- Abstract summary: We find that the most reliable approach to decide when to abstain involves quantifying repetition within sampled model outputs.
Our results suggest that sampling-based confidence scores help calibrate answers to relatively unambiguous questions.
- Score: 38.83930394700588
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Trustworthy language models should abstain from answering questions when they
do not know the answer. However, the answer to a question can be unknown for a
variety of reasons. Prior research has focused on the case in which the
question is clear and the answer is unambiguous but possibly unknown, but the
answer to a question can also be unclear due to uncertainty of the questioner's
intent or context. We investigate question answering from this perspective,
focusing on answering a subset of questions with a high degree of accuracy,
from a set of questions in which many are inherently ambiguous. In this
setting, we find that the most reliable approach to decide when to abstain
involves quantifying repetition within sampled model outputs, rather than the
model's likelihood or self-verification as used in prior work. We find this to
be the case across different types of uncertainty and model scales, and with or
without instruction tuning. Our results suggest that sampling-based confidence
scores help calibrate answers to relatively unambiguous questions, with more
dramatic improvements on ambiguous questions.
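The repetition-based abstention policy described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the list of sampled answers is assumed to come from repeated temperature-based decoding of the same question, confidence is the empirical frequency of the modal answer, and the model abstains when that frequency falls below a threshold.

```python
from collections import Counter

def repetition_confidence(answers):
    """Confidence = fraction of samples agreeing on the modal answer."""
    counts = Counter(a.strip().lower() for a in answers)
    top_answer, top_count = counts.most_common(1)[0]
    return top_answer, top_count / len(answers)

def answer_or_abstain(answers, threshold=0.5):
    """Abstain unless the sampled outputs repeat one answer often enough."""
    answer, confidence = repetition_confidence(answers)
    return answer if confidence >= threshold else None  # None = abstain

# Example: ten sampled outputs for a single question
samples = ["Paris", "Paris", "paris", "Lyon", "Paris",
           "Paris", "Marseille", "Paris", "Paris", "Paris"]
print(answer_or_abstain(samples))  # high agreement -> "paris"
```

The key design point the abstract highlights is that this agreement signal needs no access to token likelihoods or a separate self-verification step: it reads confidence off the spread of the samples themselves.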
Related papers
- Don't Just Say "I don't know"! Self-aligning Large Language Models for Responding to Unknown Questions with Explanations [70.6395572287422]
The self-alignment method is capable of not only refusing to answer but also providing an explanation of why unknown questions are unanswerable.
Disparity-driven self-curation is used to select qualified data for fine-tuning the LLM itself, aligning its responses to unknown questions as desired.
arXiv Detail & Related papers (2024-02-23T02:24:36Z)
- Answering Ambiguous Questions with a Database of Questions, Answers, and Revisions [95.92276099234344]
We present a new state-of-the-art for answering ambiguous questions that exploits a database of unambiguous questions generated from Wikipedia.
Our method improves performance by 15% on recall measures and 10% on measures which evaluate disambiguating questions from predicted outputs.
arXiv Detail & Related papers (2023-08-16T20:23:16Z) - Answering Ambiguous Questions via Iterative Prompting [84.3426020642704]
In open-domain question answering, due to the ambiguity of questions, multiple plausible answers may exist.
One approach is to directly predict all valid answers, but this can struggle with balancing relevance and diversity.
We present AmbigPrompt to address the imperfections of existing approaches to answering ambiguous questions.
arXiv Detail & Related papers (2023-07-08T04:32:17Z) - CLAM: Selective Clarification for Ambiguous Questions with Large
Language Models [37.37606905433334]
We show that current SotA models do not ask the user for clarification when presented with imprecise questions.
We introduce CLAM, a framework that first uses the model to detect ambiguous questions and, if one is detected, prompts the model to ask the user for clarification.
We show that our method achieves a 20.15 percentage point accuracy improvement over SotA on a novel ambiguous question-answering dataset.
arXiv Detail & Related papers (2022-12-15T12:47:18Z) - CREPE: Open-Domain Question Answering with False Presuppositions [92.20501870319765]
We introduce CREPE, a QA dataset containing a natural distribution of presupposition failures from online information-seeking forums.
We find that 25% of questions contain false presuppositions, and provide annotations for these presuppositions and their corrections.
We show that adaptations of existing open-domain QA models can find presuppositions moderately well, but struggle when predicting whether a presupposition is factually correct.
arXiv Detail & Related papers (2022-11-30T18:54:49Z) - Answering Ambiguous Questions through Generative Evidence Fusion and
Round-Trip Prediction [46.38201136570501]
We present a model that aggregates and combines evidence from multiple passages to adaptively predict a single answer or a set of question-answer pairs for ambiguous questions.
Our model, named Refuel, achieves a new state-of-the-art performance on the AmbigQA dataset, and shows competitive performance on NQ-Open and TriviaQA.
arXiv Detail & Related papers (2020-11-26T05:48:55Z) - A Wrong Answer or a Wrong Question? An Intricate Relationship between
Question Reformulation and Answer Selection in Conversational Question
Answering [15.355557454305776]
We show that question rewriting (QR) of the conversational context helps shed more light on this phenomenon.
We present the results of this analysis on the TREC CAsT and QuAC (CANARD) datasets.
arXiv Detail & Related papers (2020-10-13T06:29:51Z) - Match$^2$: A Matching over Matching Model for Similar Question
Identification [74.7142127303489]
Community Question Answering (CQA) has become a primary means for people to acquire knowledge, where people are free to ask questions or submit answers.
Similar question identification becomes a core task in CQA which aims to find a similar question from the archived repository whenever a new question is asked.
It has long been a challenge to properly measure the similarity between two questions due to the inherent variation of natural language, i.e., there can be different ways to ask the same question or different questions sharing similar expressions.
Traditional methods typically take a one-sided approach, which leverages the answer as an expanded representation of the question.
arXiv Detail & Related papers (2020-06-21T05:59:34Z) - Rephrasing visual questions by specifying the entropy of the answer
distribution [0.0]
We propose a novel task: rephrasing questions by controlling their ambiguity.
The ambiguity of a visual question is defined by the use of the entropy of the answer distribution predicted by a VQA model.
We demonstrate the advantage of our approach that can control the ambiguity of the rephrased questions, and an interesting observation that it is harder to increase than to reduce ambiguity.
arXiv Detail & Related papers (2020-04-10T09:32:37Z)
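The entropy-based ambiguity measure described in the last entry can be sketched as follows. This is a hedged illustration, not that paper's code: it assumes the VQA model exposes a probability distribution over candidate answers, and the example distributions below are made up.

```python
import math

def answer_entropy(probs):
    """Shannon entropy (in bits) of a predicted answer distribution.
    Higher entropy = the question is more ambiguous for the model."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A peaked distribution (clear question) vs. a flat one (ambiguous question)
clear = [0.9, 0.05, 0.03, 0.02]
ambiguous = [0.25, 0.25, 0.25, 0.25]
print(answer_entropy(clear) < answer_entropy(ambiguous))  # True
```

Under this definition, rephrasing a question to reduce its ambiguity amounts to pushing the predicted answer distribution toward a single peak, i.e., toward lower entropy.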
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.