Related papers: CookDial: A dataset for task-oriented dialogs grounded in procedural documents

CookDial: A dataset for task-oriented dialogs grounded in procedural documents

URL: http://arxiv.org/abs/2206.08723v1
Date: Fri, 17 Jun 2022 12:23:53 GMT
Title: CookDial: A dataset for task-oriented dialogs grounded in procedural documents
Authors: Yiwei Jiang, Klim Zaporojets, Johannes Deleu, Thomas Demeester, Chris Develder
Score: 21.431615439267734
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: This work presents a new dialog dataset, CookDial, that facilitates research on task-oriented dialog systems with procedural knowledge understanding. The corpus contains 260 human-to-human task-oriented dialogs in which an agent, given a recipe document, guides the user to cook a dish. Dialogs in CookDial exhibit two unique features: (i) procedural alignment between the dialog flow and supporting document; (ii) complex agent decision-making that involves segmenting long sentences, paraphrasing hard instructions and resolving coreference in the dialog context. In addition, we identify three challenging (sub)tasks in the assumed task-oriented dialog system: (1) User Question Understanding, (2) Agent Action Frame Prediction, and (3) Agent Response Generation. For each of these tasks, we develop a neural baseline model, which we evaluate on the CookDial dataset. We publicly release the CookDial dataset, comprising rich annotations of both dialogs and recipe documents, to stimulate further research on domain-specific document-grounded dialog systems.

Related papers

On Mitigating Data Sparsity in Conversational Recommender Systems [69.70761335240738]
Conversational recommender systems (CRSs) capture user preference through textual information in dialogues. They suffer from data sparsity on two fronts: the dialogue space is vast and linguistically diverse, while the item space exhibits long-tail and sparse distributions. Existing methods struggle with (1) generalizing to varied dialogue expressions due to underutilization of rich textual cues, and (2) learning informative item representations under severe sparsity.
arXiv Detail & Related papers (2025-07-01T06:54:51Z)
Multi-User MultiWOZ: Task-Oriented Dialogues among Multiple Users [51.34484827552774]
We release the Multi-User MultiWOZ dataset: task-oriented dialogues among two users and one agent. These dialogues reflect interesting dynamics of collaborative decision-making in task-oriented scenarios. We propose a novel task of multi-user contextual query rewriting: to rewrite a task-oriented chat between two users as a concise task-oriented query.
arXiv Detail & Related papers (2023-10-31T14:12:07Z)
Leveraging Explicit Procedural Instructions for Data-Efficient Action Prediction [5.448684866061922]
Task-oriented dialogues often require agents to enact complex, multi-step procedures in order to meet user requests. Large language models have found success automating these dialogues in constrained environments, but their widespread deployment is limited by the substantial quantities of task-specific data required for training. This paper presents a data-efficient solution to constructing dialogue systems, leveraging explicit instructions derived from agent guidelines.
arXiv Detail & Related papers (2023-06-06T18:42:08Z)
Doc2Bot: Accessing Heterogeneous Documents via Conversational Bots [103.54897676954091]
Doc2Bot is a dataset for building machines that help users seek information via conversations. Our dataset contains over 100,000 turns based on Chinese documents from five domains.
arXiv Detail & Related papers (2022-10-20T07:33:05Z)
Dialog Acts for Task-Driven Embodied Agents [10.275619475149433]
Embodied agents need to be able to interact in natural language, understanding task descriptions and asking appropriate follow-up questions. We propose a set of dialog acts for modelling such dialogs and annotate the TEACh dataset, which includes over 3,000 situated, task-oriented conversations. We demonstrate the use of this annotated dataset in training models for tagging the dialog acts of a given utterance, predicting the dialog act of the next response given a dialog history, and using the dialog acts to guide the agent's non-dialog behaviour.
arXiv Detail & Related papers (2022-09-26T18:41:28Z)
Manual-Guided Dialogue for Flexible Conversational Agents [84.46598430403886]
How to build and use dialogue data efficiently, and how to deploy models in different domains at scale can be critical issues in building a task-oriented dialogue system. We propose a novel manual-guided dialogue scheme, where the agent learns the tasks from both dialogue and manuals. Our proposed scheme reduces the dependence of dialogue models on fine-grained domain ontology, and makes them more flexible to adapt to various domains.
arXiv Detail & Related papers (2022-08-16T08:21:12Z)
KETOD: Knowledge-Enriched Task-Oriented Dialogue [77.59814785157877]
Existing studies in dialogue system research mostly treat task-oriented dialogue and chit-chat as separate domains. We investigate how task-oriented dialogue and knowledge-grounded chit-chat can be effectively integrated into a single model.
arXiv Detail & Related papers (2022-05-11T16:01:03Z)
HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data [87.67278915655712]
We present a new dialogue dataset, HybriDialogue, which consists of crowdsourced natural conversations grounded on both Wikipedia text and tables. The conversations are created through the decomposition of complex multihop questions into simple, realistic multiturn dialogue interactions.
arXiv Detail & Related papers (2022-04-28T00:52:16Z)
A Unified Pre-training Framework for Conversational AI [25.514505462661763]
PLATO-2 is trained via two-stage curriculum learning to fit the simplified one-to-one mapping relationship. PLATO-2 obtained 1st place in all three tasks, verifying its effectiveness as a unified framework for various dialogue systems.
arXiv Detail & Related papers (2021-05-06T07:27:11Z)
Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension [49.92173751203827]
In multi-turn dialog, utterances do not always take the full form of sentences. We propose to improve the response generation performance by examining the model's ability to answer a reading comprehension question.
arXiv Detail & Related papers (2020-12-14T10:58:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.