Task2Dial: A Novel Task and Dataset for Commonsense enhanced Task-based
Dialogue Grounded in Documents
- URL: http://arxiv.org/abs/2204.01061v1
- Date: Sun, 3 Apr 2022 12:15:56 GMT
- Authors: Carl Strathearn and Dimitra Gkatzia
- Abstract summary: This paper proposes a novel task on commonsense-enhanced task-based dialogue grounded in documents.
It describes the Task2Dial dataset, a novel dataset of document-grounded task-based dialogues.
- Score: 0.304585143845864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes a novel task on commonsense-enhanced task-based dialogue
grounded in documents and describes the Task2Dial dataset, a novel dataset of
document-grounded task-based dialogues, where an Information Giver (IG)
provides instructions (by consulting a document) to an Information Follower
(IF), so that the latter can successfully complete the task. In this unique
setting, the IF can ask clarification questions which may not be grounded in
the underlying document and require commonsense knowledge to be answered. The
Task2Dial dataset poses new challenges: (1) its human reference texts show more
lexical richness and variation than other document-grounded dialogue datasets;
(2) generating from this set requires paraphrasing, as instructional responses
might have been modified from the underlying document; (3) it requires commonsense
knowledge, since questions might not necessarily be grounded in the document;
(4) generating requires planning based on context, as task steps need to be
provided in order. The Task2Dial dataset contains dialogues with an average of
18.15 turns and 19.79 tokens per turn, compared to 12.94 and 12,
respectively, in existing datasets. As such, learning from this dataset promises
more natural, varied and less template-like system utterances.
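The two corpus statistics quoted in the abstract (average turns per dialogue and average tokens per turn) can be illustrated with a minimal sketch. The toy dialogues, the `dialogue_stats` helper, and the whitespace tokenization are all assumptions made for illustration; the abstract does not say how the authors tokenized their data.

```python
from statistics import mean


def dialogue_stats(dialogues):
    """Return (average turns per dialogue, average tokens per turn).

    Each dialogue is a list of utterance strings. Tokens are counted by
    splitting on whitespace, which is an assumption, not the paper's method.
    """
    turns_per_dialogue = [len(dialogue) for dialogue in dialogues]
    tokens_per_turn = [
        len(turn.split()) for dialogue in dialogues for turn in dialogue
    ]
    return mean(turns_per_dialogue), mean(tokens_per_turn)


# Hypothetical toy data, not taken from the Task2Dial dataset.
toy_dialogues = [
    ["Boil the water first.", "How much water do I need?", "About two cups."],
    ["Chop the onion.", "Done."],
]

avg_turns, avg_tokens = dialogue_stats(toy_dialogues)
print(f"average turns per dialogue: {avg_turns}")
print(f"average tokens per turn: {avg_tokens}")
```

Applied to the full dataset with the authors' tokenizer, the same two averages would correspond to the 18.15 and 19.79 figures reported above.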
Related papers
- The Power of Summary-Source Alignments [62.76959473193149]
Multi-document summarization (MDS) is a challenging task, often decomposed to subtasks of salience and redundancy detection.
Alignment of corresponding sentences between a reference summary and its source documents has been leveraged to generate training data.
This paper proposes extending the summary-source alignment framework by applying it at the more fine-grained proposition span level.
(2024-06-02T19:35:19Z)
- CREATIVESUMM: Shared Task on Automatic Summarization for Creative Writing [90.58269243992318]
This paper introduces the shared task of summarizing documents in several creative domains, namely literary texts, movie scripts, and television scripts.
We introduce four sub-tasks and their corresponding datasets, focusing on summarizing books, movie scripts, primetime television scripts, and daytime soap opera scripts.
As part of the CREATIVESUMM workshop at COLING 2022, the shared task attracted 18 submissions in total.
(2022-11-10T21:31:03Z)
- Doc2Bot: Accessing Heterogeneous Documents via Conversational Bots [103.54897676954091]
Doc2Bot is a dataset for building machines that help users seek information via conversations.
Our dataset contains over 100,000 turns based on Chinese documents from five domains.
(2022-10-20T07:33:05Z)
- FETA: A Benchmark for Few-Sample Task Transfer in Open-Domain Dialogue [70.65782786401257]
This work explores conversational task transfer by introducing FETA: a benchmark for few-sample task transfer in open-domain dialogue.
FETA contains two underlying sets of conversations upon which there are 10 and 7 tasks annotated, enabling the study of intra-dataset task transfer.
We utilize three popular language models and three learning algorithms to analyze the transferability between 132 source-target task pairs.
(2022-05-12T17:59:00Z)
- KETOD: Knowledge-Enriched Task-Oriented Dialogue [77.59814785157877]
Existing studies in dialogue system research mostly treat task-oriented dialogue and chit-chat as separate domains.
We investigate how task-oriented dialogue and knowledge-grounded chit-chat can be effectively integrated into a single model.
(2022-05-11T16:01:03Z)
- MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents [14.807409907211452]
We propose MultiDoc2Dial, a new task and dataset on modeling goal-oriented dialogues grounded in multiple documents.
We introduce a new dataset that contains dialogues grounded in multiple documents from four different domains.
(2021-09-26T13:12:05Z)
- doc2dial: A Goal-Oriented Document-Grounded Dialogue Dataset [24.040517978408484]
doc2dial is a new dataset of goal-oriented dialogues grounded in documents.
We first construct dialogue flows based on the content elements that correspond to higher-level relations across text sections.
We present these dialogue flows to crowd contributors to create conversational utterances.
(2020-11-12T19:08:44Z)
- Detecting Ongoing Events Using Contextual Word and Sentence Embeddings [110.83289076967895]
This paper introduces the Ongoing Event Detection (OED) task.
The goal is to detect ongoing event mentions only, as opposed to historical, future, hypothetical, or other forms of events that are neither fresh nor current.
Any application that needs to extract structured information about ongoing events from unstructured texts can take advantage of an OED system.
(2020-07-02T20:44:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.