ChatZero:Zero-shot Cross-Lingual Dialogue Generation via Pseudo-Target Language
- URL: http://arxiv.org/abs/2408.08724v1
- Date: Fri, 16 Aug 2024 13:11:53 GMT
- Title: ChatZero:Zero-shot Cross-Lingual Dialogue Generation via Pseudo-Target Language
- Authors: Yongkang Liu, Feng Shi, Daling Wang, Yifei Zhang, Hinrich Schütze,
- Abstract summary: We propose a novel end-to-end zero-shot dialogue generation model ChatZero based on cross-lingual code-switching method.
Experiments on the multilingual DailyDialog and DSTC7-AVSD datasets demonstrate that ChatZero can achieve more than 90% of the original performance.
- Score: 53.8622516025736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although large language models(LLMs) show amazing capabilities, among various exciting applications discovered for LLMs fall short in other low-resource languages. Besides, most existing methods depend on large-scale dialogue corpora and thus building systems for dialogue generation in a zero-shot scenario remains a considerable challenge. To address this challenge, we propose a novel end-to-end zero-shot dialogue generation model ChatZero based on cross-lingual code-switching method. First, we construct code-switching language and pseudo-target language with placeholders. Then for cross-lingual semantic transfer, we employ unsupervised contrastive learning to minimize the semantics gap of the source language, code-switching language, and pseudo-target language that are mutually positive examples in the high dimensional semantic space. Experiments on the multilingual DailyDialog and DSTC7-AVSD datasets demonstrate that ChatZero can achieve more than 90\% of the original performance under the zero-shot case compared to supervised learning, and achieve state-of-the-art performance compared with other baselines.
Related papers
- Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
Lens is a novel approach to enhance multilingual capabilities of large language models (LLMs)
It operates by manipulating the hidden representations within the language-agnostic and language-specific subspaces from top layers of LLMs.
It achieves superior results with much fewer computational resources compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z) - LaDA: Latent Dialogue Action For Zero-shot Cross-lingual Neural Network
Language Modeling [20.002861239367704]
Cross-lingual adaptation has proven effective in spoken language understanding systems with limited resources.
Existing methods are frequently unsatisfactory for intent detection and slot filling.
Latent Dialogue Action layer is proposed to optimize decoding strategy.
arXiv Detail & Related papers (2023-08-05T15:51:45Z) - Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z) - Efficiently Aligned Cross-Lingual Transfer Learning for Conversational
Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z) - MulZDG: Multilingual Code-Switching Framework for Zero-shot Dialogue
Generation [23.711903266714508]
MulZDG can effectively transfer knowledge from an English corpus with large-scale training samples to a non-English corpus with zero samples.
First, we construct multilingual code-switching dialogue datasets via translation utterances randomly selected from monolingual English datasets.
Then we employ MulZDG to train a unified multilingual dialogue model based on the code-switching datasets.
arXiv Detail & Related papers (2022-08-18T04:28:20Z) - Multi-level Contrastive Learning for Cross-lingual Spoken Language
Understanding [90.87454350016121]
We develop novel code-switching schemes to generate hard negative examples for contrastive learning at all levels.
We develop a label-aware joint model to leverage label semantics for cross-lingual knowledge transfer.
arXiv Detail & Related papers (2022-05-07T13:44:28Z) - Meta-X$_{NLG}$: A Meta-Learning Approach Based on Language Clustering
for Zero-Shot Cross-Lingual Transfer and Generation [11.155430893354769]
This paper proposes a novel meta-learning framework to learn shareable structures from typologically diverse languages.
We first cluster the languages based on language representations and identify the centroid language of each cluster.
A meta-learning algorithm is trained with all centroid languages and evaluated on the other languages in the zero-shot setting.
arXiv Detail & Related papers (2022-03-19T05:22:07Z) - Zero-Shot Cross-lingual Semantic Parsing [56.95036511882921]
We study cross-lingual semantic parsing as a zero-shot problem without parallel data for 7 test languages.
We propose a multi-task encoder-decoder model to transfer parsing knowledge to additional languages using only English-Logical form paired data.
Our system frames zero-shot parsing as a latent-space alignment problem and finds that pre-trained models can be improved to generate logical forms with minimal cross-lingual transfer penalty.
arXiv Detail & Related papers (2021-04-15T16:08:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.