An Empirical Study of Cross-Lingual Transferability in Generative
Dialogue State Tracker
- URL: http://arxiv.org/abs/2101.11360v1
- Date: Wed, 27 Jan 2021 12:45:55 GMT
- Authors: Yen-Ting Lin, Yun-Nung Chen
- Abstract summary: We study the transferability of a cross-lingual generative dialogue state tracking system using a multilingual pre-trained seq2seq model.
We also find that our approaches have low cross-lingual transferability, and we provide an investigation and discussion.
- Score: 33.2309643963072
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been a rapid development in data-driven task-oriented dialogue
systems with the benefit of large-scale datasets. However, the progress of
dialogue systems in low-resource languages lags far behind due to the lack of
high-quality data. To advance the cross-lingual technology in building dialog
systems, DSTC9 introduces the task of cross-lingual dialog state tracking,
where we test the DST module in a low-resource language given the rich-resource
training dataset.
This paper studies the transferability of a cross-lingual generative dialogue
state tracking system using a multilingual pre-trained seq2seq model. We
experiment under different settings, including joint-training or pre-training
on cross-lingual and cross-ontology datasets. We also find that our approaches
have low cross-lingual transferability, and we provide an investigation and
discussion.
Related papers
- Cross-lingual Data Augmentation for Document-grounded Dialog Systems in
Low Resource Languages [0.0]
We present CLEM (Cross-Lingual Enhanced Model), a novel pipeline comprising retrieval with adversarial training (Retriever and Re-ranker) and a FiD (fusion-in-decoder) generator.
To further leverage high-resource languages, we also propose an innovative architecture that aligns different languages using translated training.
arXiv Detail & Related papers (2023-05-24T09:40:52Z)
- Is Translation Helpful? An Empirical Analysis of Cross-Lingual Transfer
in Low-Resource Dialog Generation [21.973937517854935]
Cross-lingual transfer is important for developing high-quality chatbots in multiple languages.
In this work, we investigate whether it is helpful to utilize machine translation (MT) at all in this task.
Experiments show that leveraging English dialog corpora can indeed improve the naturalness, relevance and cross-domain transferability in Chinese.
arXiv Detail & Related papers (2023-05-21T15:07:04Z)
- Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining
for Task-Oriented Dialog [67.20796950016735]
Multi2WOZ dataset spans four typologically diverse languages: Chinese, German, Arabic, and Russian.
We introduce a new framework for multilingual conversational specialization of pretrained language models (PrLMs) that aims to facilitate cross-lingual transfer for arbitrary downstream TOD tasks.
Our experiments show that, in most setups, the best performance entails the combination of (i) conversational specialization in the target language and (ii) few-shot transfer for the concrete TOD task.
arXiv Detail & Related papers (2022-05-20T18:35:38Z)
- Scheduled Multi-task Learning for Neural Chat Translation [66.81525961469494]
We propose a scheduled multi-task learning framework for Neural Chat Translation (NCT).
Specifically, we devise a three-stage training framework to incorporate the large-scale in-domain chat translation data into training.
Extensive experiments in four language directions verify the effectiveness and superiority of the proposed approach.
arXiv Detail & Related papers (2022-05-08T02:57:28Z)
- A Study on Prompt-based Few-Shot Learning Methods for Belief State
Tracking in Task-oriented Dialog Systems [10.024834304960846]
We tackle the Dialogue Belief State Tracking problem of task-oriented conversational systems.
Recent approaches to this problem leveraging Transformer-based models have yielded great results.
We explore prompt-based few-shot learning for Dialogue Belief State Tracking.
arXiv Detail & Related papers (2022-04-18T05:29:54Z)
- Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation [70.81596088969378]
The Cross-lingual Outline-based Dialogue dataset (termed COD) enables natural language understanding, dialogue state tracking, and end-to-end dialogue modelling and evaluation in 4 diverse languages.
arXiv Detail & Related papers (2022-01-31T18:11:21Z)
- GlobalWoZ: Globalizing MultiWoZ to Develop Multilingual Task-Oriented
Dialogue Systems [66.92182084456809]
We introduce a novel data curation method that generates GlobalWoZ -- a large-scale multilingual ToD dataset from an English ToD dataset.
Our method is based on translating dialogue templates and filling them with local entities in the target-language countries.
We release our dataset as well as a set of strong baselines to encourage research on learning multilingual ToD systems for real use cases.
arXiv Detail & Related papers (2021-10-14T19:33:04Z)
- Cross-lingual Intermediate Fine-tuning improves Dialogue State Tracking [84.50302759362698]
We enhance the transfer learning process by intermediate fine-tuning of pretrained multilingual models.
We use parallel and conversational movie subtitles datasets to design cross-lingual intermediate tasks.
We achieve impressive improvements (> 20% on goal accuracy) on the parallel MultiWoZ dataset and Multilingual WoZ dataset.
arXiv Detail & Related papers (2021-09-28T11:22:38Z)
- Efficient Dialogue State Tracking by Masked Hierarchical Transformer [0.3441021278275805]
We build a cross-lingual dialog state tracker with a training set in a rich-resource language and a testing set in a low-resource language.
We formulate a method for joint learning of the slot operation classification task and the state tracking task.
arXiv Detail & Related papers (2021-06-28T07:35:49Z)
- BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue
Modeling [52.99188200886738]
BiToD is the first bilingual multi-domain dataset for end-to-end task-oriented dialogue modeling.
BiToD contains over 7k multi-domain dialogues (144k utterances) with a large and realistic bilingual knowledge base.
arXiv Detail & Related papers (2021-06-05T03:38:42Z)