Internet-Augmented Dialogue Generation
- URL: http://arxiv.org/abs/2107.07566v1
- Date: Thu, 15 Jul 2021 19:00:35 GMT
- Title: Internet-Augmented Dialogue Generation
- Authors: Mojtaba Komeili, Kurt Shuster, Jason Weston
- Abstract summary: Large language models are known to hallucinate facts when generating dialogue.
We propose an approach that learns to generate an internet search query based on the context.
We train and evaluate such models on a newly collected dataset of human-human conversations.
- Score: 31.38493489631621
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The largest store of continually updating knowledge on our planet can be
accessed via internet search. In this work we study giving access to this
information to conversational agents. Large language models, even though they
store an impressive amount of knowledge within their weights, are known to
hallucinate facts when generating dialogue (Shuster et al., 2021); moreover,
those facts are frozen in time at the point of model training. In contrast, we
propose an approach that learns to generate an internet search query based on
the context, and then conditions on the search results to finally generate a
response, a method that can employ up-to-the-minute relevant information. We
train and evaluate such models on a newly collected dataset of human-human
conversations whereby one of the speakers is given access to internet search
during knowledge-driven discussions in order to ground their responses. We find
that search-query based access of the internet in conversation provides
superior performance compared to existing approaches that either use no
augmentation or FAISS-based retrieval (Lewis et al., 2020).
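
The three stages the abstract describes (generate a search query from the dialogue context, retrieve results, then condition the response on them) can be sketched as below. This is a minimal illustration under loud assumptions, not the authors' implementation: the names `generate_query`, `search`, `generate_response`, and `respond` are hypothetical, the query generator and response generator are toy heuristics standing in for learned models, and an in-memory list of strings stands in for an internet search engine.

```python
from typing import List, Tuple

# Hypothetical stopword list used only by the toy query generator below.
STOPWORDS = {"the", "a", "an", "is", "of", "who", "what", "in", "to", "do", "you", "know"}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return [w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")]


def generate_query(context: List[str]) -> str:
    """Toy stand-in for a learned query generator:
    keep the content words of the most recent dialogue turn."""
    return " ".join(w for w in tokenize(context[-1]) if w not in STOPWORDS)


def search(query: str, corpus: List[str], top_k: int = 1) -> List[str]:
    """Toy stand-in for internet search:
    rank documents by word overlap with the query."""
    q = set(tokenize(query))
    ranked = sorted(corpus, key=lambda d: len(q & set(tokenize(d))), reverse=True)
    return ranked[:top_k]


def generate_response(context: List[str], docs: List[str]) -> str:
    """Toy stand-in for a generator conditioned on retrieved text:
    quote the top document if there is one."""
    return f"According to what I found: {docs[0]}" if docs else "I am not sure."


def respond(context: List[str], corpus: List[str]) -> Tuple[str, str]:
    """Run the pipeline: context -> query -> search results -> response."""
    query = generate_query(context)
    docs = search(query, corpus)
    return query, generate_response(context, docs)


if __name__ == "__main__":
    corpus = [
        "The Eiffel Tower is in Paris.",
        "Mount Everest is the highest mountain on Earth.",
    ]
    query, reply = respond(["Hi!", "What is the highest mountain?"], corpus)
    print(query)
    print(reply)
```

Keeping the query step separate from the response step means the search backend can be swapped (for example, for a live search API) without touching the generator, which is what lets the response draw on information newer than the model's training data.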
Related papers
- A Survey of Conversational Search [44.09402706387407]
We explore the recent advancements and potential future directions in conversational search.
We highlight the integration of large language models (LLMs) in enhancing these systems.
We provide insights into real-world applications and robust evaluations of current conversational search systems.
arXiv Detail & Related papers (2024-10-21T01:54:46Z) - Data Augmentation for Conversational AI [17.48107304359591]
Data augmentation (DA) is an effective approach to alleviate the data scarcity problem in conversational systems.
This tutorial provides a comprehensive and up-to-date overview of DA approaches in the context of conversational systems.
arXiv Detail & Related papers (2023-09-09T09:56:35Z) - AutoConv: Automatically Generating Information-seeking Conversations
with Large Language Models [74.10293412011455]
We propose AutoConv for synthetic conversation generation.
Specifically, we formulate the conversation generation problem as a language modeling task.
We finetune an LLM with a few human conversations to capture the characteristics of the information-seeking process.
arXiv Detail & Related papers (2023-08-12T08:52:40Z) - Does Collaborative Human-LM Dialogue Generation Help Information
Extraction from Human Dialogues? [55.28340832822234]
Problem-solving human dialogues in real applications can be much more complex than existing Wizard-of-Oz collections.
We introduce a human-in-the-loop dialogue generation framework capable of synthesizing realistic dialogues.
arXiv Detail & Related papers (2023-07-13T20:02:50Z) - q2d: Turning Questions into Dialogs to Teach Models How to Search [11.421839177607147]
We propose q2d: an automatic data generation pipeline that generates information-seeking dialogs from questions.
Unlike previous approaches, which relied on human-written dialogs with search queries, our method automatically generates query-based grounded dialogs with better control and scale.
arXiv Detail & Related papers (2023-04-27T16:39:15Z) - Search-Engine-augmented Dialogue Response Generation with Cheaply
Supervised Query Production [98.98161995555485]
We propose a dialogue model that can access the vast and dynamic information from any search engine for response generation.
As the core module, a query producer is used to generate queries from a dialogue context to interact with a search engine.
Experiments show that our query producer can achieve R@1 and R@5 rates of 62.4% and 74.8% for retrieving gold knowledge.
arXiv Detail & Related papers (2023-02-16T01:58:10Z) - Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning [35.67318830455459]
We develop a real-time, open-ended dialogue system that uses reinforcement learning (RL) to power a bot's conversational skill at scale.
Our work pairs the succinct embedding of the conversation state generated using SOTA (supervised) language models with RL techniques that are particularly suited to a dynamic action space.
arXiv Detail & Related papers (2022-07-25T16:12:33Z) - Training Conversational Agents with Generative Conversational Networks [74.9941330874663]
We use Generative Conversational Networks to automatically generate data and train social conversational agents.
We evaluate our approach on TopicalChat with automatic metrics and human evaluators, showing that with 10% of seed data it performs close to the baseline that uses 100% of the data.
arXiv Detail & Related papers (2021-10-15T21:46:39Z) - Few-Shot Bot: Prompt-Based Learning for Dialogue Systems [58.27337673451943]
Learning to converse using only a few examples is a great challenge in conversational AI.
The current best conversational models are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL).
We propose prompt-based few-shot learning which does not require gradient-based fine-tuning but instead uses a few examples as the only source of learning.
arXiv Detail & Related papers (2021-10-15T14:36:45Z) - BERT Embeddings Can Track Context in Conversational Search [5.3222282321717955]
We develop a conversational search system that helps people search for information in a natural way.
The system is able to understand the context in which a question is posed, tracking the current state of the conversation and detecting mentions of previous questions and answers.
arXiv Detail & Related papers (2021-04-13T22:02:24Z) - Conversations with Search Engines: SERP-based Conversational Response
Generation [77.1381159789032]
We create a suitable dataset, the Search as a Conversation (SaaC) dataset, for the development of pipelines for conversations with search engines.
We also develop a state-of-the-art pipeline for conversations with search engines, Conversations with Search Engines (CaSE), using this dataset.
CaSE enhances the state of the art by introducing a supporting token identification module and a prior-aware pointer generator.
arXiv Detail & Related papers (2020-04-29T13:07:53Z)