Information Seeking in the Spirit of Learning: a Dataset for
Conversational Curiosity
- URL: http://arxiv.org/abs/2005.00172v2
- Date: Tue, 10 Nov 2020 02:09:50 GMT
- Title: Information Seeking in the Spirit of Learning: a Dataset for
Conversational Curiosity
- Authors: Pedro Rodriguez, Paul Crook, Seungwhan Moon, Zhiguang Wang
- Abstract summary: We design a Wizard-of-Oz dialog task that tests the hypothesis that engagement increases when users are presented with facts related to what they know.
We collect and release 14K dialogs (181K utterances) where users and assistants converse about geographic topics.
This dataset is annotated with pre-existing user knowledge, message-level dialog acts, grounding to Wikipedia, and user reactions to messages.
- Score: 10.409312809724458
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open-ended human learning and information-seeking are increasingly mediated
by digital assistants. However, such systems often ignore the user's
pre-existing knowledge. Assuming a correlation between engagement and user
responses such as "liking" messages or asking followup questions, we design a
Wizard-of-Oz dialog task that tests the hypothesis that engagement increases
when users are presented with facts related to what they know. Through
crowd-sourcing of this experiment, we collect and release 14K dialogs (181K
utterances) where users and assistants converse about geographic topics like
geopolitical entities and locations. This dataset is annotated with
pre-existing user knowledge, message-level dialog acts, grounding to Wikipedia,
and user reactions to messages. Responses using a user's prior knowledge
increase engagement. We incorporate this knowledge into a multi-task model that
reproduces human assistant policies and improves over a BERT content model by
13 mean reciprocal rank points.
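The abstract reports its gain in mean reciprocal rank (MRR). As a minimal sketch of how that metric is conventionally computed (not the authors' evaluation code), the function below averages the reciprocal of the rank at which the correct item appears in each ranked candidate list. The function name, the fact identifiers, and the example rankings are hypothetical, and reading "13 points" as a 0.13 difference on the 0-1 scale is an assumption.

```python
from typing import Sequence


def mean_reciprocal_rank(
    ranked_lists: Sequence[Sequence[str]], gold: Sequence[str]
) -> float:
    """Average of 1/rank of the gold item per query.

    A query whose gold item is missing from its ranked list contributes 0.
    """
    if len(ranked_lists) != len(gold):
        raise ValueError("need exactly one gold item per ranked list")
    if not gold:
        return 0.0
    total = 0.0
    for candidates, target in zip(ranked_lists, gold):
        # Ranks are 1-indexed: the top candidate has rank 1.
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == target:
                total += 1.0 / rank
                break
    return total / len(gold)


# Hypothetical example: three dialog turns, each ranking the same three facts.
rankings = [
    ["fact_a", "fact_b", "fact_c"],  # gold at rank 1 -> 1
    ["fact_b", "fact_a", "fact_c"],  # gold at rank 2 -> 1/2
    ["fact_c", "fact_b", "fact_a"],  # gold at rank 3 -> 1/3
]
gold_facts = ["fact_a", "fact_a", "fact_a"]
mrr = mean_reciprocal_rank(rankings, gold_facts)  # (1 + 1/2 + 1/3) / 3
```

MRR rewards placing the correct item near the top of the list, which suits a setting where only one candidate is ultimately selected.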
Related papers
- Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations [8.848859080368799]
Collaborative STORM lets users observe and steer the discourse among several LM agents.
The agents ask questions on the user's behalf, allowing the user to discover unknown unknowns serendipitously.
For automatic evaluation, we construct the WildSeek dataset by collecting real information-seeking records with user goals.
arXiv Detail & Related papers (2024-08-27T17:50:03Z)
- NewsDialogues: Towards Proactive News Grounded Conversation [72.10055780635625]
We propose a novel task, Proactive News Grounded Conversation, in which a dialogue system can proactively lead the conversation based on some key topics of the news.
To further develop this novel task, we collect a human-to-human Chinese dialogue dataset, NewsDialogues, which includes 1K conversations with a total of 14.6K utterances.
arXiv Detail & Related papers (2023-08-12T08:33:42Z)
- Let's Get Personal: Personal Questions Improve SocialBot Performance in
the Alexa Prize [0.0]
There has been an increased focus on creating conversational open-domain dialogue systems in the spoken dialogue community.
Unlike traditional dialogue systems, these conversational systems cannot assume any specific information need or domain restrictions.
We developed a robust open-domain conversational system, Athena, that real Amazon Echo users access and evaluate at scale.
arXiv Detail & Related papers (2023-03-09T00:10:29Z)
- KPT: Keyword-guided Pre-training for Grounded Dialog Generation [82.68787152707455]
We propose KPT (Keyword-guided Pre-Training), a novel self-supervised pre-training method for grounded dialog generation.
Specifically, we use a pre-trained language model to extract the most uncertain tokens in the dialog as keywords.
We conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages.
arXiv Detail & Related papers (2022-12-04T04:05:01Z)
- Knowledge-Grounded Conversational Data Augmentation with Generative
Conversational Networks [76.11480953550013]
We take a step towards automatically generating conversational data using Generative Conversational Networks.
We evaluate our approach on conversations with and without knowledge on the Topical Chat dataset.
arXiv Detail & Related papers (2022-07-22T22:37:14Z)
- Learning as Conversation: Dialogue Systems Reinforced for Information
Acquisition [30.91417206129677]
We propose novel AI-empowered chat bots for learning as conversation where a user does not read a passage but gains information and knowledge through conversation with a teacher bot.
Our information-acquisition-oriented dialogue system employs a novel adaptation of reinforced self-play so that the system can be transferred to various domains without in-domain dialogue data.
arXiv Detail & Related papers (2022-05-29T19:42:25Z)
- KETOD: Knowledge-Enriched Task-Oriented Dialogue [77.59814785157877]
Existing studies in dialogue system research mostly treat task-oriented dialogue and chit-chat as separate domains.
We investigate how task-oriented dialogue and knowledge-grounded chit-chat can be effectively integrated into a single model.
arXiv Detail & Related papers (2022-05-11T16:01:03Z)
- There Are a Thousand Hamlets in a Thousand People's Eyes: Enhancing
Knowledge-grounded Dialogue with Personal Memory [67.24942840683904]
We introduce personal memory into knowledge selection in knowledge-grounded conversation (KGC).
We devise a learning scheme in which the forward mapping from personal memory to knowledge and its inverse mapping are included in a closed loop.
Experiment results show that our method outperforms existing KGC methods significantly on both automatic evaluation and human evaluation.
arXiv Detail & Related papers (2022-04-06T07:06:37Z)
- Multi-Sentence Knowledge Selection in Open-Domain Dialogue [11.936691632841388]
We evaluate the existing state of open-domain conversation knowledge selection.
We create an augmented dataset, WOW++, based on the Wizard of Wikipedia (WOW) corpus.
WOW++ averages 8 relevant knowledge sentences per dialogue context.
arXiv Detail & Related papers (2022-03-01T22:07:05Z)
- Call for Customized Conversation: Customized Conversation Grounding
Persona and Knowledge [25.378474996192438]
We introduce the call For Customized conversation (FoCus) dataset, in which customized answers are built from the user's persona and Wikipedia knowledge.
We evaluate how well pre-trained language models produce informative and customized utterances.
arXiv Detail & Related papers (2021-12-16T04:44:27Z)
- IART: Intent-aware Response Ranking with Transformers in
Information-seeking Conversation Systems [80.0781718687327]
We analyze user intent patterns in information-seeking conversations and propose an intent-aware neural response ranking model, "IART".
IART integrates user intent modeling and language representation learning using the Transformer architecture.
arXiv Detail & Related papers (2020-02-03T05:59:52Z)