Gotcha! Don't trick me with unanswerable questions! Self-aligning Large
Language Models for Responding to Unknown Questions
- URL: http://arxiv.org/abs/2402.15062v1
- Date: Fri, 23 Feb 2024 02:24:36 GMT
- Title: Gotcha! Don't trick me with unanswerable questions! Self-aligning Large
Language Models for Responding to Unknown Questions
- Authors: Yang Deng, Yong Zhao, Moxin Li, See-Kiong Ng, Tat-Seng Chua
- Abstract summary: Self-alignment method is capable of not only refusing to answer but also providing explanation to the unanswerability of unknown questions.
We conduct disparity-driven self-curation to select qualified data for fine-tuning the LLM itself for aligning the responses to unknown questions as desired.
- Score: 75.78536317322616
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the remarkable abilities of Large Language Models (LLMs) to answer
questions, they often display a considerable level of overconfidence even when
the question does not have a definitive answer. To avoid providing hallucinated
answers to these unknown questions, existing studies typically investigate
approaches to refusing to answer these questions. In this work, we propose a
novel and scalable self-alignment method to utilize the LLM itself to enhance
its response-ability to different types of unknown questions, being capable of
not only refusing to answer but also providing explanation to the
unanswerability of unknown questions. Specifically, the Self-Align method first
employ a two-stage class-aware self-augmentation approach to generate a large
amount of unknown question-response data. Then we conduct disparity-driven
self-curation to select qualified data for fine-tuning the LLM itself for
aligning the responses to unknown questions as desired. Experimental results on
two datasets across four types of unknown questions validate the superiority of
the Self-Align method over existing baselines in terms of three types of task
formulation.
Related papers
- I Could've Asked That: Reformulating Unanswerable Questions [89.93173151422636]
We evaluate open-source and proprietary models for reformulating unanswerable questions.
GPT-4 and Llama2-7B successfully reformulate questions only 26% and 12% of the time, respectively.
We publicly release the benchmark and the code to reproduce the experiments.
arXiv Detail & Related papers (2024-07-24T17:59:07Z) - Which questions should I answer? Salience Prediction of Inquisitive Questions [118.097974193544]
We show that highly salient questions are empirically more likely to be answered in the same article.
We further validate our findings by showing that answering salient questions is an indicator of summarization quality in news.
arXiv Detail & Related papers (2024-04-16T21:33:05Z) - Improving Zero-shot Visual Question Answering via Large Language Models
with Reasoning Question Prompts [22.669502403623166]
We present Reasoning Question Prompts for VQA tasks, which can further activate the potential of Large Language Models.
We generate self-contained questions as reasoning question prompts via an unsupervised question edition module.
Each reasoning question prompt clearly indicates the intent of the original question.
Then, the candidate answers associated with their confidence scores acting as answer integritys are fed into LLMs.
arXiv Detail & Related papers (2023-11-15T15:40:46Z) - Can NLP Models 'Identify', 'Distinguish', and 'Justify' Questions that
Don't have a Definitive Answer? [43.03399918557937]
In real-world applications, users often ask questions that don't have a definitive answer.
We introduce QnotA, a dataset consisting of five different categories of questions that don't have definitive answers.
With this data, we formulate three evaluation tasks that test a system's ability to 'identify', 'distinguish', and 'justify' QnotA questions.
We show that even SOTA models including GPT-3 and Flan T5 do not fare well on these tasks and lack considerably behind the human performance baseline.
arXiv Detail & Related papers (2023-09-08T23:12:03Z) - Answering Ambiguous Questions with a Database of Questions, Answers, and
Revisions [95.92276099234344]
We present a new state-of-the-art for answering ambiguous questions that exploits a database of unambiguous questions generated from Wikipedia.
Our method improves performance by 15% on recall measures and 10% on measures which evaluate disambiguating questions from predicted outputs.
arXiv Detail & Related papers (2023-08-16T20:23:16Z) - Selectively Answering Ambiguous Questions [38.83930394700588]
We find that the most reliable approach to decide when to abstain involves quantifying repetition within sampled model outputs.
Our results suggest that sampling-based confidence scores help calibrate answers to relatively unambiguous questions.
arXiv Detail & Related papers (2023-05-24T01:25:38Z) - CREPE: Open-Domain Question Answering with False Presuppositions [92.20501870319765]
We introduce CREPE, a QA dataset containing a natural distribution of presupposition failures from online information-seeking forums.
We find that 25% of questions contain false presuppositions, and provide annotations for these presuppositions and their corrections.
We show that adaptations of existing open-domain QA models can find presuppositions moderately well, but struggle when predicting whether a presupposition is factually correct.
arXiv Detail & Related papers (2022-11-30T18:54:49Z) - Generating Answer Candidates for Quizzes and Answer-Aware Question
Generators [16.44011627249311]
We propose a model that can generate a specified number of answer candidates for a given passage of text.
Our experiments show that our proposed answer candidate generation model outperforms several baselines.
arXiv Detail & Related papers (2021-08-29T19:33:51Z) - MS-Ranker: Accumulating Evidence from Potentially Correct Candidates for
Answer Selection [59.95429407899612]
We propose a novel reinforcement learning based multi-step ranking model, named MS-Ranker.
We explicitly consider the potential correctness of candidates and update the evidence with a gating mechanism.
Our model significantly outperforms existing methods that do not rely on external resources.
arXiv Detail & Related papers (2020-10-10T10:36:58Z) - Stay Hungry, Stay Focused: Generating Informative and Specific Questions
in Information-Seeking Conversations [41.74162467619795]
We investigate the problem of generating informative questions in information-asymmetric conversations.
To generate pragmatic questions, we use reinforcement learning to optimize an informativeness metric.
We demonstrate that the resulting pragmatic questioner substantially improves the informativeness and specificity of questions generated over a baseline model.
arXiv Detail & Related papers (2020-04-30T00:49:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.