HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on
Tabular and Textual Data
- URL: http://arxiv.org/abs/2204.13243v1
- Date: Thu, 28 Apr 2022 00:52:16 GMT
- Title: HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on
Tabular and Textual Data
- Authors: Kai Nakamura, Sharon Levy, Yi-Lin Tuan, Wenhu Chen, William Yang Wang
- Abstract summary: We present a new dialogue dataset, HybriDialogue, which consists of crowdsourced natural conversations grounded on both Wikipedia text and tables.
The conversations are created through the decomposition of complex multihop questions into simple, realistic multiturn dialogue interactions.
- Score: 87.67278915655712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A pressing challenge in current dialogue systems is to successfully converse
with users on topics with information distributed across different modalities.
Previous work in multiturn dialogue systems has primarily focused on either
text or table information. In more realistic scenarios, having a joint
understanding of both is critical as knowledge is typically distributed over
both unstructured and structured forms. We present a new dialogue dataset,
HybriDialogue, which consists of crowdsourced natural conversations grounded on
both Wikipedia text and tables. The conversations are created through the
decomposition of complex multihop questions into simple, realistic multiturn
dialogue interactions. We propose retrieval, system state tracking, and
dialogue response generation tasks for our dataset and conduct baseline
experiments for each. Our results show that there is still ample opportunity
for improvement, demonstrating the importance of building stronger dialogue
systems that can reason over the complex setting of information-seeking
dialogue grounded on tables and text.
Related papers
- Bridging Information Gaps in Dialogues With Grounded Exchanges Using Knowledge Graphs [4.449835214520727]
We study the potential of large language models for conversational grounding.
Our approach involves annotating human conversations across five knowledge domains to create a new dialogue corpus called BridgeKG.
Our findings offer insights into how these models use in-context learning for conversational grounding tasks and common prediction errors.
arXiv Detail & Related papers (2024-08-02T08:07:15Z)
- MP2D: An Automated Topic Shift Dialogue Generation Framework Leveraging Knowledge Graphs [15.876075659237722]
Multi-Passage to Dialogue (MP2D) generates question-answering datasets with natural topic transitions.
MP2D maps the flow of topics within a dialogue, effectively mirroring the dynamics of human conversation.
This study introduces a novel benchmark for topic shift dialogues, TS-WikiDialog.
arXiv Detail & Related papers (2024-03-09T06:28:48Z)
- Are cascade dialogue state tracking models speaking out of turn in spoken dialogues? [1.786898113631979]
This paper presents a comprehensive error analysis of state-of-the-art systems in complex settings such as Dialogue State Tracking.
Based on spoken MultiWOZ, we identify errors on non-categorical slot values as essential to address in order to bridge the gap between spoken and chat-based dialogue systems.
arXiv Detail & Related papers (2023-11-03T08:45:22Z)
- DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI [92.29874802394167]
DialogStudio is the largest and most diverse collection of dialogue datasets.
Our collection encompasses data from open-domain dialogues, task-oriented dialogues, natural language understanding, conversational recommendation, dialogue summarization, and knowledge-grounded dialogues.
arXiv Detail & Related papers (2023-07-19T17:57:53Z)
- CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation [75.60156479374416]
CGoDial is a new challenging and comprehensive Chinese benchmark for Goal-oriented Dialog evaluation.
It contains 96,763 dialog sessions and 574,949 dialog turns in total, covering three datasets with different knowledge sources.
To bridge the gap between academic benchmarks and spoken dialog scenarios, we either collect data from real conversations or add spoken features to existing datasets via crowd-sourcing.
arXiv Detail & Related papers (2022-11-21T16:21:41Z)
- KETOD: Knowledge-Enriched Task-Oriented Dialogue [77.59814785157877]
Existing studies in dialogue system research mostly treat task-oriented dialogue and chit-chat as separate domains.
We investigate how task-oriented dialogue and knowledge-grounded chit-chat can be effectively integrated into a single model.
arXiv Detail & Related papers (2022-05-11T16:01:03Z)
- Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling [80.51094098799736]
We propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder.
BiDeN explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks.
Experimental results on datasets of different downstream tasks demonstrate the universality and effectiveness of our BiDeN.
arXiv Detail & Related papers (2022-04-18T03:51:46Z)
- Dialogue State Tracking with Multi-Level Fusion of Predicted Dialogue States and Conversations [2.6529642559155944]
We propose the Dialogue State Tracking with Multi-Level Fusion of Predicted Dialogue States and Conversations network.
This model extracts information from each dialogue turn by modeling interactions among the turn's utterance, the corresponding previous dialogue states, and the dialogue slots.
arXiv Detail & Related papers (2021-07-12T02:30:30Z)
- Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension [49.92173751203827]
In multi-turn dialog, utterances do not always take the full form of sentences.
We propose to improve the response generation performance by examining the model's ability to answer a reading comprehension question.
arXiv Detail & Related papers (2020-12-14T10:58:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.