EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with
Large-Scale Pre-Training
- URL: http://arxiv.org/abs/2203.09313v3
- Date: Sat, 21 Oct 2023 15:36:48 GMT
- Title: EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with
Large-Scale Pre-Training
- Authors: Yuxian Gu, Jiaxin Wen, Hao Sun, Yi Song, Pei Ke, Chujie Zheng, Zheng
Zhang, Jianzhu Yao, Lei Liu, Xiaoyan Zhu, Minlie Huang
- Abstract summary: EVA2.0 is a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters.
We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and will make our models and codes publicly available.
- Score: 73.98154158068134
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale pre-training has shown remarkable performance in building
open-domain dialogue systems. However, previous works mainly focus on showing
and evaluating the conversational performance of the released dialogue model,
ignoring the discussion of some key factors towards a powerful human-like
chatbot, especially in Chinese scenarios. In this paper, we conduct extensive
experiments to investigate these under-explored factors, including data quality
control, model architecture designs, training approaches, and decoding
strategies. We propose EVA2.0, a large-scale pre-trained open-domain Chinese
dialogue model with 2.8 billion parameters, and will make our models and codes
publicly available. Automatic and human evaluations show that EVA2.0
significantly outperforms other open-source counterparts. We also discuss the
limitations of this work by presenting some failure cases and pose some future
research directions on large-scale Chinese open-domain dialogue systems.
Related papers
- ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented
Instruction Tuning for Digital Human [76.62897301298699]
ChatPLUG is a Chinese open-domain dialogue system for digital human applications that instruction finetunes on a wide range of dialogue tasks in a unified internet-augmented format.
We show that modelname outperforms state-of-the-art Chinese dialogue systems on both automatic and human evaluation.
We deploy modelname to real-world applications such as Smart Speaker and Instant Message applications with fast inference.
arXiv Detail & Related papers (2023-04-16T18:16:35Z) - GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z) - Modeling Coreference Relations in Visual Dialog [18.926582410644375]
The occurrences of coreference relations in the dialog makes it a more challenging task than visual question-answering.
We propose two soft constraints that can improve the model's ability of resolving coreferences in dialog in an unsupervised way.
arXiv Detail & Related papers (2022-03-06T15:22:24Z) - Automatic Evaluation and Moderation of Open-domain Dialogue Systems [59.305712262126264]
A long standing challenge that bothers the researchers is the lack of effective automatic evaluation metrics.
This paper describes the data, baselines and results obtained for the Track 5 at the Dialogue System Technology Challenge 10 (DSTC10)
arXiv Detail & Related papers (2021-11-03T10:08:05Z) - EVA: An Open-Domain Chinese Dialogue System with Large-Scale Generative
Pre-Training [40.85554509137999]
We propose EVA, a Chinese dialogue system that contains the largest Chinese pre-trained dialogue model with 2.8B parameters.
To build this model, we collect the largest Chinese dialogue dataset named WDC-Dialogue from various public social media.
Experiments on automatic and human evaluation show that EVA outperforms other Chinese pre-trained dialogue models.
arXiv Detail & Related papers (2021-08-03T14:55:24Z) - RADDLE: An Evaluation Benchmark and Analysis Platform for Robust
Task-oriented Dialog Systems [75.87418236410296]
We introduce the RADDLE benchmark, a collection of corpora and tools for evaluating the performance of models across a diverse set of domains.
RADDLE is designed to favor and encourage models with a strong generalization ability.
We evaluate recent state-of-the-art systems based on pre-training and fine-tuning, and find that grounded pre-training on heterogeneous dialog corpora performs better than training a separate model per domain.
arXiv Detail & Related papers (2020-12-29T08:58:49Z) - Multi-Modal Open-Domain Dialogue [28.69395893943413]
Recent work in open-domain conversational agents has demonstrated that significant improvements in model engagingness and humanness metrics can be achieved via massive scaling.
We investigate combining components from state-of-the-art open-domain dialogue agents with those from state-of-the-art vision models.
We show that our best resulting model outperforms strong existing models in multi-modal dialogue while simultaneously performing as well as its predecessor.
arXiv Detail & Related papers (2020-10-02T16:20:39Z) - Non-Autoregressive Dialog State Tracking [122.2328875457225]
We propose a novel framework of Non-Autoregressive Dialog State Tracking (NADST)
NADST can factor in potential dependencies among domains and slots to optimize the models towards better prediction of dialogue states as a complete set rather than separate slots.
Our results show that our model achieves the state-of-the-art joint accuracy across all domains on the MultiWOZ 2.1 corpus.
arXiv Detail & Related papers (2020-02-19T06:39:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.