Investigating Effect of Dialogue History in Multilingual Task Oriented
Dialogue Systems
- URL: http://arxiv.org/abs/2112.12318v1
- Date: Thu, 23 Dec 2021 02:27:10 GMT
- Title: Investigating Effect of Dialogue History in Multilingual Task Oriented
Dialogue Systems
- Authors: Michael Sun, Kaili Huang, and Mehrad Moradshahi
- Abstract summary: As of Dec 2021, Alexa, one of the most popular smart speakers around the world, supports 9 different languages.
Training a virtual assistant in languages other than English is often more difficult, especially for low-resource languages.
We devise an efficient and effective training solution for multilingual task-oriented dialogue systems.
- Score: 2.695466667982714
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While English virtual assistants have achieved exciting performance with
an enormous amount of training resources, the needs of non-English speakers
have not been well served. As of Dec 2021, Alexa, one of the most popular
smart speakers around the world, supports 9 different languages [1],
while there are thousands of languages in the world, 91 of which are spoken by
more than 10 million people according to statistics published in 2019 [2].
However, training a virtual assistant in languages other than English is often
more difficult, especially for low-resource languages. The lack of
high-quality training data restricts the performance of models, resulting in
poor user satisfaction. Therefore, we devise an efficient and effective
training solution for multilingual task-oriented dialogue systems, using the
same dataset generation pipeline and end-to-end dialogue system architecture as
BiToD [5], which adopts a minimalistic natural language design in which formal
dialogue states are used in place of natural language inputs. This reduces the
room for error introduced by weaker natural language models and ensures the
model can correctly extract the essential slot values needed to perform
dialogue state tracking (DST). Our goal is to reduce the amount of natural
language encoded at each turn, and the key parameter we investigate is the
number of turns (H) to feed to the model as history. We first explore the
turning point where increasing H begins to yield diminishing returns on
overall performance. Then we examine whether the examples a model with small H
gets wrong can be categorized in a way that allows the model to be finetuned
on them in a few-shot manner. Lastly, we explore the limitations of this
approach, and whether there is a certain type of example that it cannot
resolve.
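As a concrete illustration of the history parameter, the sketch below shows one way to serialize the formal dialogue state together with only the last H turns into a single model input. The speaker tags, separators, and function names are our own assumptions, not the actual BiToD serialization:

```python
from typing import Dict, List

def build_model_input(history: List[Dict[str, str]], formal_state: str, H: int) -> str:
    """Serialize the formal dialogue state plus the last H turns of history
    into one encoder input string for a seq2seq DST model (illustrative)."""
    recent = history[-H:] if H > 0 else []  # H = 0 drops all history
    turns = " ".join(f"<{t['speaker']}> {t['utterance']}" for t in recent)
    return f"<state> {formal_state} <history> {turns}".strip()

history = [
    {"speaker": "user", "utterance": "Find me a hotel near the station."},
    {"speaker": "system", "utterance": "Hotel ICON has rooms available."},
    {"speaker": "user", "utterance": "Book it for two nights."},
]
# With H = 2, only the two most recent turns reach the encoder.
print(build_model_input(history, "hotel(area=station)", H=2))
```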
Related papers
- ChatZero:Zero-shot Cross-Lingual Dialogue Generation via Pseudo-Target Language [53.8622516025736]
We propose ChatZero, a novel end-to-end zero-shot dialogue generation model based on a cross-lingual code-switching method.
Experiments on the multilingual DailyDialog and DSTC7-AVSD datasets demonstrate that ChatZero can achieve more than 90% of the original performance.
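A rough sketch of the code-switching idea (the bilingual lexicon, placeholder token, and switching rate below are illustrative assumptions, not ChatZero's actual procedure):

```python
import random

def code_switch(tokens, lexicon, p=0.5, placeholder="[MASK]"):
    """Build a pseudo-target utterance: with probability p, replace each
    source token by its target-language translation when the bilingual
    lexicon covers it, otherwise by a placeholder token."""
    out = []
    for tok in tokens:
        if random.random() < p:
            out.append(lexicon.get(tok.lower(), placeholder))
        else:
            out.append(tok)
    return out

en_de = {"good": "gut", "morning": "Morgen", "friend": "Freund"}
print(code_switch("good morning my friend".split(), en_de))
```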
arXiv Detail & Related papers (2024-08-16T13:11:53Z)
- Adapting Multilingual Speech Representation Model for a New, Underresourced Language through Multilingual Fine-tuning and Continued Pretraining [2.3513645401551333]
We investigate the possibility of adapting an existing multilingual wav2vec 2.0 model to a new language.
Our results show that continued pretraining is the most effective method to adapt a wav2vec 2.0 model for a new language.
We find that if a model pretrained on a related speech variety or an unrelated language with similar phonological characteristics is available, multilingual fine-tuning using additional data from that language can have a positive impact on speech recognition performance.
arXiv Detail & Related papers (2023-01-18T03:57:53Z)
- Low-Resource Multilingual and Zero-Shot Multispeaker TTS [25.707717591185386]
We show that it is possible for a system to learn to speak a new language using just 5 minutes of training data.
We show the success of our proposed approach in terms of intelligibility, naturalness and similarity to target speaker.
arXiv Detail & Related papers (2022-10-21T20:03:37Z)
- Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation [70.81596088969378]
The Cross-lingual Outline-based Dialogue dataset (termed COD) enables natural language understanding, dialogue state tracking, and end-to-end dialogue modelling and evaluation in 4 diverse languages.
arXiv Detail & Related papers (2022-01-31T18:11:21Z)
- Neural Models for Offensive Language Detection [0.0]
Offensive language detection is an ever-growing natural language processing (NLP) application.
We believe that improving and comparing machine learning models to fight such harmful content is an important and challenging goal for this thesis.
arXiv Detail & Related papers (2021-05-30T13:02:45Z)
- Crossing the Conversational Chasm: A Primer on Multilingual Task-Oriented Dialogue Systems [51.328224222640614]
Current state-of-the-art ToD models based on large pretrained neural language models are data hungry.
Data acquisition for ToD use cases is expensive and tedious.
arXiv Detail & Related papers (2021-04-17T15:19:56Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low resource languages is a challenging task because the patterns are hard to predict.
This work shows a comparison of a neural model and character language models with varying amounts of target language data.
Our usage scenario is interactive correction with nearly zero training examples, improving the models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- Language Models as Few-Shot Learner for Task-Oriented Dialogue Systems [74.8759568242933]
Task-oriented dialogue systems use four connected modules: Natural Language Understanding (NLU), Dialogue State Tracking (DST), Dialogue Policy (DP), and Natural Language Generation (NLG).
A research challenge is to learn each module with the fewest samples possible, given the high cost of data collection.
We evaluate the priming few-shot ability of language models in the NLU, DP and NLG tasks.
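To make the four-module decomposition concrete, here is a minimal structural sketch of the pipeline (the stubbed functions and names are our own illustration; the paper primes a language model for each task rather than implementing modules like these):

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class DialogueContext:
    utterance: str
    belief_state: Dict[str, str] = field(default_factory=dict)

def nlu(ctx: DialogueContext) -> Dict[str, str]:
    """NLU: map the user utterance to slot-value pairs (stubbed)."""
    return {"intent": "find_restaurant"}

def dst(ctx, slots):  # DST: fold newly extracted slots into the belief state
    ctx.belief_state.update(slots)
    return ctx.belief_state

def policy(state):  # DP: choose a system action from the belief state
    return "request(area)"

def nlg(action):  # NLG: verbalize the chosen system action
    return "Which area are you looking in?"

ctx = DialogueContext("I want a restaurant")
print(nlg(policy(dst(ctx, nlu(ctx)))))
```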
arXiv Detail & Related papers (2020-08-14T08:23:21Z)
- TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
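A minimal sketch of that special-token setup using the Hugging Face transformers API ([USR] and [SYS] follow the paper's description; the rest is an illustrative configuration, not TOD-BERT's training code):

```python
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Register speaker tokens so each turn is prefixed by who said it.
tokenizer.add_special_tokens({"additional_special_tokens": ["[USR]", "[SYS]"]})
model.resize_token_embeddings(len(tokenizer))

dialogue = "[USR] i need a cheap hotel [SYS] what area do you prefer"
inputs = tokenizer(dialogue, return_tensors="pt")
outputs = model(**inputs)  # ready for masked-LM pretraining on dialogues
```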
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
- Deep Learning Models for Multilingual Hate Speech Detection [5.977278650516324]
In this paper, we conduct a large-scale analysis of multilingual hate speech in 9 languages from 16 different sources.
We observe that in low-resource settings, simple models such as LASER embeddings with logistic regression perform best.
In the case of zero-shot classification, languages such as Italian and Portuguese achieve good results.
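A minimal sketch of that low-resource baseline, assuming LASER sentence embeddings have already been computed (the random arrays below are stand-ins for real embeddings and labels):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 1024))  # stand-in for 1024-d LASER embeddings
y_train = rng.integers(0, 2, size=200)  # 1 = hate speech, 0 = benign

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

X_test = rng.normal(size=(5, 1024))
print(clf.predict(X_test))  # per-sentence hate/benign predictions
```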
arXiv Detail & Related papers (2020-04-14T13:14:27Z)
- From English To Foreign Languages: Transferring Pre-trained Language Models [0.12691047660244334]
Pre-trained models have demonstrated their effectiveness in many downstream natural language processing (NLP) tasks.
The availability of multilingual pre-trained models enables zero-shot transfer of NLP tasks from high resource languages to low resource ones.
We tackle the problem of transferring an existing pre-trained model from English to other languages under a limited computational budget.
arXiv Detail & Related papers (2020-02-18T00:22:54Z)