Can AI Assistants Know What They Don't Know?
- URL: http://arxiv.org/abs/2401.13275v2
- Date: Sun, 28 Jan 2024 09:07:13 GMT
- Title: Can AI Assistants Know What They Don't Know?
- Authors: Qinyuan Cheng and Tianxiang Sun and Xiangyang Liu and Wenwei Zhang and
Zhangyue Yin and Shimin Li and Linyang Li and Zhengfu He and Kai Chen and
Xipeng Qiu
- Abstract summary: An AI assistant's refusal to answer questions it does not know is a crucial method for reducing hallucinations and making the assistant truthful.
We construct a model-specific "I don't know" (Idk) dataset for an assistant, which contains its known and unknown questions.
After alignment with Idk datasets, the assistant can refuse to answer most of its unknown questions.
- Score: 79.6178700946602
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, AI assistants based on large language models (LLMs) have shown surprising
performance in many tasks, such as dialogue, solving math problems, writing
code, and using tools. Although LLMs possess extensive world knowledge, they
still make factual errors on knowledge-intensive tasks, such as
open-domain question answering. These untruthful responses from the AI
assistant may cause significant risks in practical applications. We believe
that an AI assistant's refusal to answer questions it does not know is a
crucial method for reducing hallucinations and making the assistant truthful.
Therefore, in this paper, we ask the question "Can AI assistants know what they
don't know and express this through natural language?" To answer this question,
we construct a model-specific "I don't know" (Idk) dataset for an assistant,
which contains its known and unknown questions, based on existing open-domain
question answering datasets. Then we align the assistant with its corresponding
Idk dataset and observe whether it can refuse to answer its unknown questions
after alignment. Experimental results show that after alignment with Idk
datasets, the assistant can refuse to answer most of its unknown questions. For
questions it does attempt to answer, the accuracy is significantly higher than
before alignment.
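To make the described construction concrete, the following is a minimal sketch of how a model-specific Idk dataset might be assembled from an existing open-domain QA dataset: each question is labeled as known or unknown depending on how often the assistant answers it correctly across several sampled generations, and unknown questions are paired with an "I don't know" style target response. The `sample_answers` and `is_correct` callables, the sampling count, and the threshold below are illustrative assumptions, not the paper's exact procedure.

```python
from typing import Callable, Dict, List

IDK_RESPONSE = "I don't know the answer to this question."

def build_idk_dataset(
    qa_pairs: List[Dict[str, str]],                    # [{"question": ..., "answer": ...}, ...]
    sample_answers: Callable[[str, int], List[str]],   # assistant's sampled generations (assumed helper)
    is_correct: Callable[[str, str], bool],            # checks a generation against the gold answer (assumed helper)
    num_samples: int = 10,
    known_threshold: float = 0.5,
) -> List[Dict[str, str]]:
    """Label each question as known/unknown for a specific assistant and build
    supervised targets: the gold answer for known questions, an 'I don't know'
    response for unknown ones."""
    dataset = []
    for pair in qa_pairs:
        question, gold = pair["question"], pair["answer"]
        generations = sample_answers(question, num_samples)
        # Fraction of sampled answers that match the gold answer.
        accuracy = sum(is_correct(g, gold) for g in generations) / num_samples
        known = accuracy >= known_threshold
        dataset.append({
            "question": question,
            "label": "known" if known else "unknown",
            "target": gold if known else IDK_RESPONSE,
        })
    return dataset
```

The resulting dataset could then be used for supervised fine-tuning or preference optimization so that the assistant learns to decline its unknown questions while still answering its known ones.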
Related papers
- Don't Just Say "I don't know"! Self-aligning Large Language Models for Responding to Unknown Questions with Explanations [70.6395572287422]
The self-alignment method is capable of not only refusing to answer but also providing an explanation for the unanswerability of unknown questions.
We conduct disparity-driven self-curation to select qualified data for fine-tuning the LLM itself, aligning its responses to unknown questions as desired.
arXiv Detail & Related papers (2024-02-23T02:24:36Z)
- A Comparative and Experimental Study on Automatic Question Answering Systems and its Robustness against Word Jumbling [0.49157446832511503]
Question-answer generation is highly relevant because a frequently asked questions (FAQ) list can only contain a finite number of questions.
A model that can perform question-answer generation would be able to answer completely new questions that are within the scope of the data.
In commercial applications, it can be used to increase customer satisfaction and ease of use.
However, much of the data is generated by humans, so it is susceptible to human error, which can adversely affect the model's performance.
arXiv Detail & Related papers (2023-11-27T03:17:09Z)
- Self-Knowledge Guided Retrieval Augmentation for Large Language Models [59.771098292611846]
Large language models (LLMs) have shown superior performance without task-specific fine-tuning.
Retrieval-based methods can offer non-parametric world knowledge and improve the performance on tasks such as question answering.
Self-Knowledge guided Retrieval augmentation (SKR) is a simple yet effective method that lets LLMs refer to the questions they have previously encountered (a minimal sketch of this adaptive-retrieval idea appears after this list).
arXiv Detail & Related papers (2023-10-08T04:22:33Z)
- Collaboration with Conversational AI Assistants for UX Evaluation: Questions and How to Ask them (Voice vs. Text) [18.884080068561843]
We conducted a Wizard-of-Oz design probe study with 20 participants who interacted with simulated AI assistants via text or voice.
We found that participants asked for five categories of information: user actions, user mental model, help from the AI assistant, product and task information, and user demographics.
The text assistant was perceived as significantly more efficient, but both were rated equally in satisfaction and trust.
arXiv Detail & Related papers (2023-03-07T03:59:14Z)
- Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language [121.56329458876655]
We introduce two new environments: the grid-world-based Q-BabyAI and the text-based Q-TextWorld.
We propose the "Asking for Knowledge" (AFK) agent, which learns to generate language commands to query for meaningful knowledge.
arXiv Detail & Related papers (2022-05-12T14:20:31Z)
- Open-domain clarification question generation without question examples [4.34222556313791]
We propose a framework for building a question-asking model capable of producing polar (yes-no) clarification questions.
Our model uses an expected information gain objective to derive informative questions from an off-the-shelf image captioner.
We demonstrate our model's ability to pose questions that improve communicative success in a goal-oriented 20 questions game with synthetic and human answerers.
arXiv Detail & Related papers (2021-10-19T07:51:54Z)
- A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers [66.11048565324468]
We present a dataset of 5,049 questions over 1,585 Natural Language Processing papers.
Each question is written by an NLP practitioner who read only the title and abstract of the corresponding paper, and the question seeks information present in the full text.
We find that existing models that do well on other QA tasks do not perform well on answering these questions, underperforming humans by at least 27 F1 points when answering them from entire papers.
arXiv Detail & Related papers (2021-05-07T00:12:34Z)
- Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z)
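As a companion to the SKR entry above, here is a minimal sketch of the adaptive-retrieval idea it describes: keep a record of questions the model previously answered correctly without retrieval ("self-known") and ones it did not ("self-unknown"), and only call the retriever for a new question when it looks more similar to the self-unknown set. The embedding callable, cosine similarity, and nearest-neighbor vote below are illustrative assumptions, not the paper's exact implementation.

```python
import math
from typing import Callable, List, Tuple

def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def should_retrieve(
    question: str,
    embed: Callable[[str], List[float]],   # assumed embedding helper
    known_questions: List[str],            # previously answered correctly without retrieval
    unknown_questions: List[str],          # previously answered incorrectly without retrieval
    k: int = 5,
) -> bool:
    """Decide whether to call the retriever for a new question via a
    nearest-neighbor vote over previously encountered questions."""
    q_vec = embed(question)
    scored: List[Tuple[float, bool]] = []  # (similarity, is_unknown)
    for past in known_questions:
        scored.append((cosine(q_vec, embed(past)), False))
    for past in unknown_questions:
        scored.append((cosine(q_vec, embed(past)), True))
    top = sorted(scored, key=lambda s: s[0], reverse=True)[:k]
    unknown_votes = sum(1 for _, is_unknown in top if is_unknown)
    # Retrieve only if the new question looks closer to the "self-unknown" set.
    return unknown_votes > len(top) / 2
```

In use, the assistant would answer directly when `should_retrieve` returns False and prepend retrieved passages to the prompt otherwise.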
This list is automatically generated from the titles and abstracts of the papers on this site.