Multilingual Code-Switching for Zero-Shot Cross-Lingual Intent
Prediction and Slot Filling
- URL: http://arxiv.org/abs/2103.07792v2
- Date: Tue, 16 Mar 2021 16:39:48 GMT
- Title: Multilingual Code-Switching for Zero-Shot Cross-Lingual Intent
Prediction and Slot Filling
- Authors: Jitin Krishnan, Antonios Anastasopoulos, Hemant Purohit, and Huzefa
Rangwala
- Abstract summary: We propose a novel method to augment the monolingual source data using multilingual code-switching via random translations.
Experiments on the MultiATIS++ benchmark dataset yielded an average improvement of +4.2% in accuracy for the intent task and +1.8% in F1 for the slot task.
We present an application of our method for crisis informatics using a new human-annotated tweet dataset for slot filling in English and Haitian Creole.
- Score: 29.17194639368877
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predicting user intent and detecting the corresponding slots from text are
two key problems in Natural Language Understanding (NLU). In the context of
zero-shot learning, this task is typically approached by either using
representations from pre-trained multilingual transformers such as mBERT, or by
machine translating the source data into the known target language and then
fine-tuning. Our work focuses on a particular scenario where the target
language is unknown during training. To this end, we propose a novel method to
augment the monolingual source data using multilingual code-switching via
random translations to enhance a transformer's language neutrality when
fine-tuning it for a downstream task. This method also helps uncover novel
insights into how code-switching with different language families around the
world impacts performance on the target language. Experiments on the
MultiATIS++ benchmark dataset yielded an average improvement of +4.2% in
accuracy for the intent task and +1.8% in F1 for the slot task using our method over
the state-of-the-art across 8 different languages. Furthermore, we present an
application of our method for crisis informatics using a new human-annotated
tweet dataset for slot filling in English and Haitian Creole, collected during
the Haiti earthquake disaster.
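Below is a minimal, illustrative sketch of the augmentation idea described in the abstract: before fine-tuning, tokens of a monolingual English utterance are randomly replaced with translations into randomly sampled languages while the slot annotation stays aligned. The tiny phrase table and the translate_token helper are stand-ins for a real translation system and are not part of the released method.

```python
import random

# Minimal sketch of multilingual code-switching augmentation. The paper uses
# machine translation into randomly sampled languages; this toy phrase table
# and translate_token() are illustrative stand-ins.
TOY_PHRASE_TABLE = {
    "show":    {"es": "muestra", "hi": "dikhao",   "de": "zeige"},
    "morning": {"es": "mañana",  "hi": "subah",    "de": "Morgen"},
    "flights": {"es": "vuelos",  "hi": "udaanein", "de": "Flüge"},
}

def translate_token(token: str, lang: str) -> str:
    """Hypothetical stand-in for a translation call; falls back to the source token."""
    return TOY_PHRASE_TABLE.get(token, {}).get(lang, token)

def code_switch(tokens, slot_labels, languages=("es", "hi", "de"), p=0.5, seed=0):
    """Randomly replace source tokens with translations into random languages.

    Slot labels are copied unchanged, so the augmented utterance stays aligned
    with the original token-level annotation (one label per token).
    """
    rng = random.Random(seed)
    switched = [
        translate_token(tok, rng.choice(languages)) if rng.random() < p else tok
        for tok in tokens
    ]
    return switched, list(slot_labels)

# ATIS-style toy example: augment one utterance before fine-tuning mBERT.
tokens = ["show", "morning", "flights", "to", "boston"]
labels = ["O", "B-period_of_day", "O", "O", "B-toloc.city_name"]
print(code_switch(tokens, labels))
```

Because each source token is replaced one-for-one, keeping the label sequence fixed is the natural design choice for this kind of token-level switching.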
Related papers
- Low-Resource Machine Translation through the Lens of Personalized Federated Learning [26.436144338377755]
We present a new approach that can be applied to natural language tasks with heterogeneous data.
We evaluate it on the Low-Resource Machine Translation task, using the dataset from the Large-Scale Multilingual Machine Translation Shared Task.
In addition to its effectiveness, MeritFed is also highly interpretable, as it can be applied to track the impact of each language used for training.
arXiv Detail & Related papers (2024-06-18T12:50:00Z)
- Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer [92.80671770992572]
Cross-lingual transfer is a central task in multilingual NLP.
Earlier efforts on this task use parallel corpora, bilingual dictionaries, or other annotated alignment data.
We propose a simple yet effective method, SALT, to improve zero-shot cross-lingual transfer.
arXiv Detail & Related papers (2023-09-19T19:30:56Z)
- Simple yet Effective Code-Switching Language Identification with Multitask Pre-Training and Transfer Learning [0.7242530499990028]
Code-switching is the linguistic phenomenon in which, in casual settings, multilingual speakers mix words from different languages within one utterance.
We propose two novel approaches toward improving language identification accuracy on an English-Mandarin child-directed speech dataset.
Our best model achieves a balanced accuracy of 0.781 on a real English-Mandarin code-switching child-directed speech corpus and outperforms the previous baseline by 55.3%.
arXiv Detail & Related papers (2023-05-31T11:43:16Z)
- CONCRETE: Improving Cross-lingual Fact-checking with Cross-lingual Retrieval [73.48591773882052]
Most fact-checking approaches focus on English only due to the data scarcity issue in other languages.
We present the first fact-checking framework augmented with crosslingual retrieval.
We train the retriever with our proposed Crosslingual Inverse Cloze Task (XICT).
arXiv Detail & Related papers (2022-09-05T17:36:14Z)
- Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation [48.80125962015044]
We investigate the problem of performing a generative task (i.e., summarization) in a target language when labeled data is only available in English.
We find that parameter-efficient adaptation provides gains over standard fine-tuning when transferring between less-related languages.
Our methods can provide further quality gains, suggesting that robust zero-shot cross-lingual generation is within reach.
arXiv Detail & Related papers (2022-05-25T10:41:34Z)
- Por Qué Não Utiliser Alla Språk? Mixed Training with Gradient Optimization in Few-Shot Cross-Lingual Transfer [2.7213511121305465]
We propose a one-step mixed training method that trains on both source and target data.
We use one model to handle all target languages simultaneously to avoid excessively language-specific models.
Our proposed method achieves state-of-the-art performance on all tasks and outperforms target-adapting by a large margin.
arXiv Detail & Related papers (2022-04-29T04:05:02Z)
- From Masked Language Modeling to Translation: Non-English Auxiliary Tasks Improve Zero-shot Spoken Language Understanding [24.149299722716155]
We introduce xSID, a new benchmark for cross-lingual Slot and Intent Detection in 13 languages from 6 language families, including a very low-resource dialect.
We propose a joint learning approach, with English SLU training data and non-English auxiliary tasks from raw text, syntax and translation for transfer.
Our results show that jointly learning the main tasks with masked language modeling is effective for slots, while machine translation transfer works best for intent classification.
arXiv Detail & Related papers (2021-05-15T23:51:11Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
It can effectively avoid the degeneration of predicting masked words only conditioned on the context in its own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
- Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from the multiple language branch models into a single model for all target languages.
arXiv Detail & Related papers (2020-10-27T13:12:17Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM fine-tuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
In addition, we propose a KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language (a minimal sketch of such a loss appears after this list).
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
- Zero-Shot Cross-Lingual Transfer with Meta Learning [45.29398184889296]
We consider the setting of training models on multiple languages at the same time, when little or no data is available for languages other than English.
We show that this challenging setup can be approached using meta-learning.
We experiment using standard supervised, zero-shot cross-lingual, as well as few-shot cross-lingual settings for different natural language understanding tasks.
arXiv Detail & Related papers (2020-03-05T16:07:32Z)
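The FILTER entry above mentions a KL-divergence self-teaching loss over auto-generated soft pseudo-labels. The sketch below is a generic version of such a loss; the function name, temperature parameter, and tensor shapes are assumptions for illustration, not FILTER's actual implementation.

```python
import torch
import torch.nn.functional as F

def self_teaching_loss(student_logits: torch.Tensor,
                       teacher_logits: torch.Tensor,
                       temperature: float = 1.0) -> torch.Tensor:
    """Generic KL-divergence self-teaching loss (illustrative, not FILTER's code).

    The teacher logits yield soft pseudo-labels (detached, so no gradient flows
    to the teacher); the student is trained to match them, e.g. on translated
    target-language text.
    """
    teacher_probs = F.softmax(teacher_logits.detach() / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

# Toy usage: a batch of 4 examples over a 3-way label space.
student = torch.randn(4, 3, requires_grad=True)
teacher = torch.randn(4, 3)
loss = self_teaching_loss(student, teacher)
loss.backward()
```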
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences of their use.