ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and
Diagnosing Dialogue Systems
- URL: http://arxiv.org/abs/2002.04793v2
- Date: Wed, 29 Apr 2020 14:02:43 GMT
- Title: ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and
Diagnosing Dialogue Systems
- Authors: Qi Zhu, Zheng Zhang, Yan Fang, Xiang Li, Ryuichi Takanobu, Jinchao Li,
Baolin Peng, Jianfeng Gao, Xiaoyan Zhu, Minlie Huang
- Abstract summary: ConvLab-2 is an open-source toolkit that enables researchers to build task-oriented dialogue systems with state-of-the-art models.
The analysis tool presents rich statistics and summarizes common mistakes from simulated dialogues.
The interactive tool allows developers to diagnose an assembled dialogue system by interacting with the system and modifying the output of each system component.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present ConvLab-2, an open-source toolkit that enables researchers to
build task-oriented dialogue systems with state-of-the-art models, perform an
end-to-end evaluation, and diagnose the weaknesses of systems. As the successor
of ConvLab (Lee et al., 2019b), ConvLab-2 inherits ConvLab's framework but
integrates more powerful dialogue models and supports more datasets. Besides,
we have developed an analysis tool and an interactive tool to assist
researchers in diagnosing dialogue systems. The analysis tool presents rich
statistics and summarizes common mistakes from simulated dialogues, which
facilitates error analysis and system improvement. The interactive tool
provides a user interface that allows developers to diagnose an assembled
dialogue system by interacting with the system and modifying the output of each
system component.
Related papers
- Are cascade dialogue state tracking models speaking out of turn in
spoken dialogues? [1.786898113631979]
This paper proposes a comprehensive analysis of the errors of state-of-the-art systems in complex settings such as Dialogue State Tracking.
Based on spoken MultiWoz, we identify that errors on non-categorical slots' values are essential to address in order to bridge the gap between spoken and chat-based dialogue systems.
arXiv Detail & Related papers (2023-11-03T08:45:22Z)
- ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data
Format [88.33443450434521]
Task-oriented dialogue (TOD) systems function as digital assistants, guiding users through various tasks such as booking flights or finding restaurants.
Existing toolkits for building TOD systems often fall short in delivering comprehensive arrays of data, models, and experimental environments.
We introduce ConvLab-3: a multifaceted dialogue system toolkit crafted to bridge this gap.
arXiv Detail & Related papers (2022-11-30T16:37:42Z)
- CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog
Evaluation [75.60156479374416]
CGoDial is a new challenging and comprehensive Chinese benchmark for Goal-oriented Dialog evaluation.
It contains 96,763 dialog sessions and 574,949 dialog turns in total, covering three datasets with different knowledge sources.
To bridge the gap between academic benchmarks and spoken dialog scenarios, we either collect data from real conversations or add spoken features to existing datasets via crowd-sourcing.
arXiv Detail & Related papers (2022-11-21T16:21:41Z)
- Manual-Guided Dialogue for Flexible Conversational Agents [84.46598430403886]
How to build and use dialogue data efficiently, and how to deploy models in different domains at scale can be critical issues in building a task-oriented dialogue system.
We propose a novel manual-guided dialogue scheme, where the agent learns the tasks from both dialogue and manuals.
Our proposed scheme reduces the dependence of dialogue models on fine-grained domain ontology, and makes them more flexible to adapt to various domains.
arXiv Detail & Related papers (2022-08-16T08:21:12Z)
- Actionable Conversational Quality Indicators for Improving Task-Oriented
Dialog Systems [2.6094079735487994]
This paper introduces and explains the use of Actionable Conversational Quality Indicators (ACQIs).
ACQIs are used both to recognize parts of dialogs that can be improved, and to recommend how to improve them.
We demonstrate the effectiveness of using ACQIs on LivePerson internal dialog systems used in commercial customer service applications.
arXiv Detail & Related papers (2021-09-22T22:41:42Z)
- Transferable Dialogue Systems and User Simulators [17.106518400787156]
One of the difficulties in training dialogue systems is the lack of training data.
We explore the possibility of creating dialogue data through the interaction between a dialogue system and a user simulator.
We develop a modelling framework that can incorporate new dialogue scenarios through self-play between the two agents.
arXiv Detail & Related papers (2021-07-25T22:59:09Z)
- Is Your Goal-Oriented Dialog Model Performing Really Well? Empirical
Analysis of System-wise Evaluation [114.48767388174218]
This paper presents an empirical analysis on different types of dialog systems composed of different modules in different settings.
Our results show that a pipeline dialog system trained using fine-grained supervision signals at different component levels often obtains better performance than the systems that use joint or end-to-end models trained on coarse-grained labels.
arXiv Detail & Related papers (2020-05-15T05:20:06Z)
- Conversation Learner -- A Machine Teaching Tool for Building Dialog
Managers for Task-Oriented Dialog Systems [57.082447660944965]
Conversation Learner is a machine teaching tool for building dialog managers.
It enables dialog authors to create a dialog flow using familiar tools, converting the dialog flow into a parametric model.
It allows dialog authors to improve the dialog manager over time by leveraging user-system dialog logs as training data.
arXiv Detail & Related papers (2020-04-09T00:10:54Z)