MulZDG: Multilingual Code-Switching Framework for Zero-shot Dialogue Generation
- URL: http://arxiv.org/abs/2208.08629v1
- Date: Thu, 18 Aug 2022 04:28:20 GMT
- Title: MulZDG: Multilingual Code-Switching Framework for Zero-shot Dialogue Generation
- Authors: Yongkang Liu, Shi Feng, Daling Wang, Yifei Zhang
- Abstract summary: MulZDG can effectively transfer knowledge from an English corpus with large-scale training samples to a non-English corpus with zero samples.
First, we construct multilingual code-switching dialogue datasets by translating utterances randomly selected from monolingual English datasets.
Then we employ MulZDG to train a unified multilingual dialogue model based on the code-switching datasets.
- Score: 23.711903266714508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building dialogue generation systems in a zero-shot scenario remains a huge
challenge, since typical zero-shot approaches in dialogue generation rely
heavily on large-scale pre-trained language generation models such as GPT-3 and
T5. Research on zero-shot dialogue generation without cumbersome language
models is limited by the lack of corresponding parallel dialogue corpora. In
this paper, we propose a simple but effective Multilingual learning framework
for Zero-shot Dialogue Generation (dubbed MulZDG) that can effectively
transfer knowledge from an English corpus with large-scale training samples to
a non-English corpus with zero samples. In addition, MulZDG can be viewed as a
multilingual data augmentation method that improves performance on the
resource-rich language. First, we construct multilingual code-switching
dialogue datasets by translating utterances randomly selected from monolingual
English datasets. Then we employ MulZDG to train a unified multilingual
dialogue model on the code-switching datasets. MulZDG performs implicit
semantic alignment between different languages. Experiments on the DailyDialog
and DSTC7 datasets demonstrate that MulZDG not only achieves performance in the
zero-shot case that is competitive with training on sufficient examples, but
also greatly improves performance on the source language.
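To make the dataset-construction step concrete, below is a minimal sketch (not the authors' released code) of how code-switched dialogues could be built by translating randomly selected utterances from an English corpus. The `translate` callable, the `switch_ratio` hyperparameter, and the per-dialogue choice of target language are illustrative assumptions; in practice any machine translation system and mixing ratio could be plugged in, and the resulting mixed-language dialogues would then be used to train a single unified multilingual dialogue model.

```python
import random
from typing import Callable, List

def build_code_switched_dialogues(
    dialogues: List[List[str]],            # each dialogue is a list of English utterances
    translate: Callable[[str, str], str],  # placeholder MT function: (utterance, target_lang) -> translation
    target_langs: List[str],
    switch_ratio: float = 0.3,             # assumed fraction of utterances to translate per dialogue
    seed: int = 0,
) -> List[List[str]]:
    """Randomly translate a subset of utterances in each English dialogue,
    producing mixed-language (code-switched) training dialogues."""
    rng = random.Random(seed)
    mixed = []
    for dialogue in dialogues:
        if not dialogue:
            continue
        n_switch = max(1, int(len(dialogue) * switch_ratio))
        switch_idx = set(rng.sample(range(len(dialogue)), n_switch))
        lang = rng.choice(target_langs)    # one target language per dialogue (an assumption)
        mixed.append([
            translate(utt, lang) if i in switch_idx else utt
            for i, utt in enumerate(dialogue)
        ])
    return mixed

# Toy usage with a dummy "translator" that only tags the target language.
if __name__ == "__main__":
    toy = [["How are you?", "I am fine, thanks.", "What about you?"]]
    dummy_mt = lambda utt, lang: f"[{lang}] {utt}"
    print(build_code_switched_dialogues(toy, dummy_mt, ["de", "zh"]))
```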
Related papers
- ChatZero: Zero-shot Cross-Lingual Dialogue Generation via Pseudo-Target Language [53.8622516025736]
We propose ChatZero, a novel end-to-end zero-shot dialogue generation model based on a cross-lingual code-switching method.
Experiments on the multilingual DailyDialog and DSTC7-AVSD datasets demonstrate that ChatZero can achieve more than 90% of the original performance.
arXiv Detail & Related papers (2024-08-16T13:11:53Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for conditionally encoding instances.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel and large-scale multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for Task-Oriented Dialog [67.20796950016735]
The Multi2WOZ dataset spans four typologically diverse languages: Chinese, German, Arabic, and Russian.
We introduce a new framework for multilingual conversational specialization of pretrained language models (PrLMs) that aims to facilitate cross-lingual transfer for arbitrary downstream TOD tasks.
Our experiments show that, in most setups, the best performance entails the combination of (i) conversational specialization in the target language and (ii) few-shot transfer for the concrete TOD task.
arXiv Detail & Related papers (2022-05-20T18:35:38Z)
- Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation [70.81596088969378]
The Cross-lingual Outline-based Dialogue dataset (termed COD) enables natural language understanding, dialogue state tracking, and end-to-end dialogue modelling and evaluation in 4 diverse languages.
arXiv Detail & Related papers (2022-01-31T18:11:21Z)
- Cross-lingual Intermediate Fine-tuning improves Dialogue State Tracking [84.50302759362698]
We enhance the transfer learning process by intermediate fine-tuning of pretrained multilingual models.
We use parallel and conversational movie subtitles datasets to design cross-lingual intermediate tasks.
We achieve impressive improvements (> 20% on goal accuracy) on the parallel MultiWoZ dataset and Multilingual WoZ dataset.
arXiv Detail & Related papers (2021-09-28T11:22:38Z)
- Adapting Monolingual Models: Data can be Scarce when Language Similarity is High [3.249853429482705]
We investigate the performance of zero-shot transfer learning with as little data as possible.
We retrain the lexical layers of four BERT-based models using data from two low-resource target language varieties.
With high language similarity, 10MB of data appears sufficient to achieve substantial monolingual transfer performance.
arXiv Detail & Related papers (2021-05-06T17:43:40Z)