MuTual: A Dataset for Multi-Turn Dialogue Reasoning
- URL: http://arxiv.org/abs/2004.04494v1
- Date: Thu, 9 Apr 2020 11:42:33 GMT
- Title: MuTual: A Dataset for Multi-Turn Dialogue Reasoning
- Authors: Leyang Cui, Yu Wu, Shujie Liu, Yue Zhang, Ming Zhou
- Abstract summary: MuTual is a novel dataset for Multi-Turn dialogue Reasoning.
It consists of 8,860 manually annotated dialogues based on Chinese student English listening comprehension exams.
We show that state-of-the-art methods only reach 71%, which is far behind the human performance of 94%.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Non-task oriented dialogue systems have achieved great success in recent
years due to largely accessible conversation data and the development of deep
learning techniques. Given a context, current systems are able to yield a
relevant and fluent response, but sometimes make logical mistakes because of
weak reasoning capabilities. To facilitate the conversation reasoning research,
we introduce MuTual, a novel dataset for Multi-Turn dialogue Reasoning,
consisting of 8,860 manually annotated dialogues based on Chinese student
English listening comprehension exams. Compared to previous benchmarks for
non-task oriented dialogue systems, MuTual is much more challenging since it
requires a model that can handle various reasoning problems. Empirical results
show that state-of-the-art methods only reach 71%, which is far behind the
human performance of 94%, indicating that there is ample room for improving
reasoning ability. MuTual is available at https://github.com/Nealcly/MuTual.
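As a concrete picture of the task, here is a minimal sketch of MuTual-style evaluation: each example pairs a multi-turn context with several candidate responses, and accuracy is the fraction of examples where the correct candidate is ranked first. The field names ("article", "options", "answers") and the letter-style labels are assumptions about the released format, so check the GitHub repository before relying on them; the overlap scorer is only a placeholder for a real model.

```python
# Minimal sketch of candidate-ranking evaluation on MuTual-style data.
# Field names are assumptions; verify against https://github.com/Nealcly/MuTual.

def score(context: str, candidate: str) -> float:
    """Toy scorer: lexical overlap between context and candidate (placeholder for a real model)."""
    ctx_tokens = set(context.lower().split())
    cand_tokens = set(candidate.lower().split())
    return len(ctx_tokens & cand_tokens) / max(len(cand_tokens), 1)

def evaluate(examples: list) -> float:
    """Accuracy: fraction of examples where the top-scored option is the labelled answer."""
    correct = 0
    for ex in examples:
        scores = [score(ex["article"], opt) for opt in ex["options"]]
        predicted = scores.index(max(scores))   # index of the best-scored candidate
        gold = "ABCD".index(ex["answers"])      # labels are letters A-D in this sketch
        correct += int(predicted == gold)
    return correct / len(examples)

toy = [{
    "article": "f: did you book the flight to london ? m: yes , it leaves at nine tomorrow morning .",
    "options": [
        "so you will arrive in london tomorrow morning .",
        "so the flight was cancelled yesterday .",
        "so you are driving to london tonight .",
        "so you have never been to london .",
    ],
    "answers": "A",
}]
print(f"toy accuracy: {evaluate(toy):.2f}")
```

A real system would replace `score` with a fine-tuned ranking model that scores each (context, candidate) pair; the evaluation loop itself stays the same.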
Related papers
- Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models [56.93074140619464]
We propose RiC (Reasoning in Conversation), a method that focuses on solving subjective tasks through dialogue simulation.
The motivation of RiC is to mine useful contextual information by simulating dialogues instead of supplying chain-of-thought style rationales.
We evaluate both API-based and open-source LLMs including GPT-4, ChatGPT, and OpenChat across twelve tasks.
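The summary above describes a simulate-then-answer protocol; the sketch below illustrates that shape under stated assumptions: the prompt wording and the `complete` callable are illustrative, not the authors' implementation.

```python
# Rough sketch of simulate-then-answer: instead of asking for a chain-of-thought
# rationale, first have the model simulate a short dialogue about the task, then
# answer with that dialogue as extra context. Prompts and `complete` are assumptions.
from typing import Callable

def answer_with_simulated_dialogue(question: str, complete: Callable[[str], str]) -> str:
    simulate_prompt = (
        "Simulate a short two-person dialogue in which the speakers discuss "
        f"the following question from different perspectives:\n{question}\n"
    )
    dialogue = complete(simulate_prompt)   # stage 1: mine context by simulation
    answer_prompt = (
        f"Question: {question}\n"
        f"Here is a dialogue discussing it:\n{dialogue}\n"
        "Based on the dialogue, give a final answer:"
    )
    return complete(answer_prompt)         # stage 2: answer using the mined context

# Usage with any text-completion backend (API-based or open-source):
#   answer = answer_with_simulated_dialogue("Is this review sarcastic? ...", my_llm_call)
```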
arXiv Detail & Related papers (2024-02-27T05:37:10Z)
- Crafting a Good Prompt or Providing Exemplary Dialogues? A Study of In-Context Learning for Persona-based Dialogue Generation [15.143135611057309]
We systematically investigate the ICL capabilities of large language models (LLMs) in persona-based dialogue generation.
From experimental results, we draw three conclusions: 1) adjusting prompt instructions is the most direct, effective, and economical way to improve generation quality; 2) randomly retrieving demonstrations (demos) achieves the best results; and 3) even when we destroy the multi-turn associations and single-turn semantics in the demos, increasing the number of demos still improves dialogue performance.
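To make conclusion 2 concrete, the following sketch assembles a persona-grounded prompt from randomly retrieved demonstrations; the record layout (persona/context/response) is an assumption for illustration, not the paper's exact format.

```python
# Small sketch of the ICL setup studied above: a persona-grounded dialogue prompt
# assembled from randomly retrieved demonstrations. Record layout is assumed.
import random

def build_icl_prompt(demos: list, test: dict, n_demos: int = 3, seed: int = 0) -> str:
    rng = random.Random(seed)
    sampled = rng.sample(demos, k=min(n_demos, len(demos)))   # random demo retrieval
    parts = ["Generate the next response consistent with the given persona.\n"]
    for d in sampled:
        parts.append(f"Persona: {d['persona']}\nContext: {d['context']}\nResponse: {d['response']}\n")
    parts.append(f"Persona: {test['persona']}\nContext: {test['context']}\nResponse:")
    return "\n".join(parts)

demos = [
    {"persona": "I love hiking.", "context": "What did you do this weekend?",
     "response": "I spent both days hiking in the hills."},
    {"persona": "I am a chef.", "context": "Any dinner plans?",
     "response": "I'm testing a new pasta recipe tonight."},
]
print(build_icl_prompt(demos, {"persona": "I play guitar.", "context": "Any hobbies?"}, n_demos=2))
```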
arXiv Detail & Related papers (2024-02-15T14:03:33Z)
- SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents [72.42049370297849]
SpokenWOZ is a large-scale speech-text dataset for spoken TOD.
Cross-turn slot and reasoning slot detection are new challenges for SpokenWOZ.
arXiv Detail & Related papers (2023-05-22T13:47:51Z)
- PK-ICR: Persona-Knowledge Interactive Context Retrieval for Grounded Dialogue [21.266410719325208]
Persona and Knowledge Dual Context Identification is the task of jointly identifying the persona and knowledge relevant to a given dialogue.
We develop a novel grounding retrieval method that utilizes all contexts of dialogue simultaneously.
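A hedged sketch of that idea: score persona and knowledge candidates against the full dialogue at once and keep the best-grounded pair. The lexical-overlap relevance function is a stand-in, not the PK-ICR model.

```python
# Stand-in grounding retrieval: score each (persona, knowledge) pair against the
# whole dialogue context simultaneously and return the best-grounded pair.
from itertools import product

def relevance(dialogue: str, passage: str) -> float:
    d, p = set(dialogue.lower().split()), set(passage.lower().split())
    return len(d & p) / max(len(p), 1)

def retrieve_grounding(dialogue_turns: list, personas: list, knowledge: list):
    full_context = " ".join(dialogue_turns)   # use all turns, not just the last one
    return max(product(personas, knowledge),
               key=lambda pair: relevance(full_context, pair[0]) + relevance(full_context, pair[1]))
```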
arXiv Detail & Related papers (2023-02-13T20:27:26Z)
- Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
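A rough sketch of that trade-off, under the assumption that a saliency score is available per turn: prune each exemplar to its most salient turns, then pack exemplars greedily into a fixed prompt budget.

```python
# Sketch: rank dialogue turns by saliency, keep the top ones to shorten each
# exemplar, then pack as many exemplars as fit into a fixed prompt budget.
# The saliency scores and the budget are placeholders.
def prune_dialogue(turns: list, saliency: list, keep: int = 4) -> list:
    """Keep the `keep` most salient turns, preserving their original order."""
    ranked = sorted(range(len(turns)), key=lambda i: saliency[i], reverse=True)[:keep]
    return [turns[i] for i in sorted(ranked)]

def pack_exemplars(exemplars: list, saliencies: list, budget_tokens: int = 1024) -> list:
    """Greedily add pruned exemplars until the (whitespace-token) budget is spent."""
    packed, used = [], 0
    for turns, sal in zip(exemplars, saliencies):
        text = " ".join(prune_dialogue(turns, sal))
        cost = len(text.split())
        if used + cost > budget_tokens:
            break
        packed.append(text)
        used += cost
    return packed
```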
arXiv Detail & Related papers (2023-02-12T15:05:10Z)
- Controllable Dialogue Simulation with In-Context Learning [39.04491297557292]
Dialogic is a dialogue simulation method based on in-context learning with large language models.
Our method can rapidly expand a small set of dialogue data with minimum or zero human involvement.
Our simulated dialogues have near-human fluency and annotation accuracy.
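A minimal sketch of that expansion loop, assuming a generic text-completion backend; the prompt wording and the `complete` callable are illustrative rather than the paper's implementation.

```python
# Sketch of simulation-based data expansion: seed dialogues are shown to an LLM
# in-context and it is asked to produce a new, annotated dialogue, so a small
# dataset can grow with little human effort. Prompts and `complete` are assumptions.
from typing import Callable

def simulate_dialogue(seed_dialogues: list, goal: str, complete: Callable[[str], str]) -> str:
    prompt = "Here are example task-oriented dialogues with annotations:\n\n"
    prompt += "\n\n".join(seed_dialogues)
    prompt += (f"\n\nNow write a new annotated dialogue for this user goal: {goal}\n"
               "Follow the same format as the examples.")
    return complete(prompt)

def expand_dataset(seed_dialogues: list, goals: list, complete: Callable[[str], str]) -> list:
    return [simulate_dialogue(seed_dialogues, g, complete) for g in goals]
```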
arXiv Detail & Related papers (2022-10-09T06:32:58Z)
- Rethinking Dialogue State Tracking with Reasoning [76.0991910623001]
This paper proposes to track dialogue states gradually, reasoning over dialogue turns with the help of back-end data.
Empirical results demonstrate that our method significantly outperforms the state-of-the-art methods by 38.6% in terms of joint belief accuracy for MultiWOZ 2.1.
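A toy sketch of gradual, back-end-aided tracking: the belief state is carried across turns and each new turn is grounded against back-end records instead of re-predicting the full state from scratch. The lookup rule is a stand-in for the paper's reasoning model.

```python
# Toy gradual state tracking: update the belief state turn by turn and let
# back-end data (a tiny database here) resolve slot values.
RESTAURANT_DB = {"panda garden": {"food": "chinese", "area": "centre"}}

def update_state(state: dict, user_turn: str) -> dict:
    new_state = dict(state)                  # carry the previous state forward
    for name, attrs in RESTAURANT_DB.items():
        if name in user_turn.lower():        # ground the turn against back-end records
            new_state["restaurant-name"] = name
            new_state.update({f"restaurant-{k}": v for k, v in attrs.items()})
    return new_state

state: dict = {}
for turn in ["I want to book a table at Panda Garden.", "Make it for four people."]:
    state = update_state(state, turn)
print(state)  # {'restaurant-name': 'panda garden', 'restaurant-food': 'chinese', 'restaurant-area': 'centre'}
```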
arXiv Detail & Related papers (2020-05-27T02:05:33Z)
- TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
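A small sketch of that input format: speaker tokens are interleaved with the turns before masking. The token strings and the whitespace-level masking here are simplifications for illustration; the TOD-BERT release defines the actual special tokens and masks wordpieces.

```python
# Sketch of role-token pre-training input: prefix each turn with a speaker token,
# then apply masked language modeling. Token strings here are illustrative.
import random

def flatten_dialogue(turns: list) -> str:
    """Prefix each (speaker, utterance) turn with its speaker token."""
    token = {"user": "[USR]", "system": "[SYS]"}
    return " ".join(f"{token[speaker]} {utt}" for speaker, utt in turns)

def mask_tokens(text: str, ratio: float = 0.15, seed: int = 0) -> str:
    """Whitespace-level masking for illustration; real MLM masks wordpiece tokens."""
    rng = random.Random(seed)
    role_tokens = {"[USR]", "[SYS]"}
    return " ".join("[MASK]" if w not in role_tokens and rng.random() < ratio else w
                    for w in text.split())

dialogue = [("user", "i need a cheap hotel in the centre"),
            ("system", "the alexander bed and breakfast is a cheap option")]
print(mask_tokens(flatten_dialogue(dialogue)))
```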
arXiv Detail & Related papers (2020-04-15T04:09:05Z)