Multilingual Transfer Learning for Code-Switched Language and Speech
Neural Modeling
- URL: http://arxiv.org/abs/2104.06268v1
- Date: Tue, 13 Apr 2021 14:49:26 GMT
- Title: Multilingual Transfer Learning for Code-Switched Language and Speech
Neural Modeling
- Authors: Genta Indra Winata
- Abstract summary: We address the data scarcity and limitations of linguistic theory by proposing language-agnostic multi-task training methods.
First, we introduce a meta-learning-based approach, meta-transfer learning, in which information is judiciously extracted from high-resource monolingual speech data to the code-switching domain.
Second, we propose a novel multilingual meta-embeddings approach to effectively represent code-switching data by acquiring useful knowledge learned in other languages.
Third, we introduce multi-task learning to integrate syntactic information as a transfer learning strategy to a language model and learn where to code-switch.
- Score: 12.497781134446898
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this thesis, we address the data scarcity and limitations of linguistic
theory by proposing language-agnostic multi-task training methods. First, we
introduce a meta-learning-based approach, meta-transfer learning, in which
information is judiciously extracted from high-resource monolingual speech data
to the code-switching domain. The meta-transfer learning quickly adapts the
model to the code-switching task from a number of monolingual tasks by learning
to learn in a multi-task learning fashion. Second, we propose a novel
multilingual meta-embeddings approach to effectively represent code-switching
data by acquiring useful knowledge learned in other languages, learning the
commonalities of closely related languages and leveraging lexical composition.
The method is far more efficient compared to contextualized pre-trained
multilingual models. Third, we introduce multi-task learning to integrate
syntactic information as a transfer learning strategy to a language model and
learn where to code-switch. To further alleviate the aforementioned issues, we
propose a data augmentation method using Pointer-Gen, a neural network using a
copy mechanism to teach the model the code-switch points from monolingual
parallel sentences. We disentangle the need for linguistic theory, and the
model captures code-switching points by attending to input words and aligning
the parallel words, without requiring any word alignments or constituency
parsers. More importantly, the model can be effectively used for languages that
are syntactically different, and it outperforms the linguistic theory-based
models.
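The multilingual meta-embeddings idea can be pictured as an attention-weighted mixture of language-specific word embeddings projected into a shared space. Below is a minimal PyTorch sketch of that idea only, not the thesis implementation; the module and parameter names (MetaEmbedding, lang_vocab_sizes, proj_dim) are illustrative.

```python
# Minimal sketch of an attention-based multilingual meta-embedding layer.
# All names here (MetaEmbedding, lang_vocab_sizes, proj_dim) are illustrative.
import torch
import torch.nn as nn


class MetaEmbedding(nn.Module):
    """Combine per-language embeddings of a token with learned attention weights."""

    def __init__(self, lang_vocab_sizes, emb_dim, proj_dim):
        super().__init__()
        # One embedding table per source language (e.g. English, Mandarin).
        self.embeddings = nn.ModuleList(
            nn.Embedding(v, emb_dim) for v in lang_vocab_sizes
        )
        # Project every language-specific space into a shared space.
        self.projections = nn.ModuleList(
            nn.Linear(emb_dim, proj_dim) for _ in lang_vocab_sizes
        )
        # Scores each projected embedding to produce attention weights.
        self.scorer = nn.Linear(proj_dim, 1)

    def forward(self, token_ids):
        # token_ids: list of (batch, seq_len) index tensors, one per language vocab.
        projected = [
            proj(emb(ids))
            for emb, proj, ids in zip(self.embeddings, self.projections, token_ids)
        ]
        stacked = torch.stack(projected, dim=-2)               # (batch, seq, langs, dim)
        weights = torch.softmax(self.scorer(stacked), dim=-2)  # attention over languages
        return (weights * stacked).sum(dim=-2)                 # (batch, seq, dim)
```

The Pointer-Gen augmentation relies on a pointer-generator output distribution, where a gate mixes the decoder's vocabulary distribution with a copy distribution over source positions so the model can learn where to copy, and thus where to switch. A hedged sketch of that mixture step, with all tensor names assumed:

```python
import torch


def pointer_generator_dist(vocab_dist, copy_attn, src_ids, p_gen):
    """Mix generation and copying, as in a pointer-generator network.

    vocab_dist: (batch, vocab_size) softmax over the output vocabulary
    copy_attn:  (batch, src_len)    attention over source tokens
    src_ids:    (batch, src_len)    source token ids mapped into the vocabulary
    p_gen:      (batch, 1)          probability of generating vs. copying
    """
    final_dist = p_gen * vocab_dist
    # Add the copy probabilities onto the slots of the corresponding source tokens.
    final_dist = final_dist.scatter_add(1, src_ids, (1 - p_gen) * copy_attn)
    return final_dist
```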
Related papers
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Simple yet Effective Code-Switching Language Identification with Multitask Pre-Training and Transfer Learning [0.7242530499990028]
Code-switching is the linguistic phenomenon in which, in casual settings, multilingual speakers mix words from different languages within a single utterance.
We propose two novel approaches toward improving language identification accuracy on an English-Mandarin child-directed speech dataset.
Our best model achieves a balanced accuracy of 0.781 on a real English-Mandarin code-switching child-directed speech corpus and outperforms the previous baseline by 55.3%.
arXiv Detail & Related papers (2023-05-31T11:43:16Z)
- On the cross-lingual transferability of multilingual prototypical models across NLU tasks [2.44288434255221]
Supervised deep learning-based approaches have been applied to task-oriented dialog and have proven to be effective for limited domain and language applications.
In practice, these approaches suffer from the drawbacks of domain-driven design and under-resourced languages.
This article investigates cross-lingual transferability by synergistically combining few-shot learning using prototypical neural networks with multilingual Transformer-based models.
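For reference, the prototypical-network step reduces to averaging support embeddings per class and labelling queries by the nearest prototype. A minimal sketch follows, assuming the embeddings come from some multilingual Transformer encoder; the function and variable names are placeholders.

```python
import torch


def prototypical_logits(support_emb, support_labels, query_emb, num_classes):
    """support_emb: (n_support, d), query_emb: (n_query, d)."""
    # Class prototypes are the mean support embeddings per class.
    prototypes = torch.stack([
        support_emb[support_labels == c].mean(dim=0) for c in range(num_classes)
    ])                                               # (num_classes, d)
    # Negative squared Euclidean distance acts as the class score.
    dists = torch.cdist(query_emb, prototypes) ** 2  # (n_query, num_classes)
    return -dists


# Usage (hypothetical encoder `enc`):
# logits = prototypical_logits(enc(support_x), support_y, enc(query_x), num_classes)
# loss = torch.nn.functional.cross_entropy(logits, query_y)
```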
arXiv Detail & Related papers (2022-07-19T09:55:04Z)
- MetaTPTrans: A Meta Learning Approach for Multilingual Code Representation Learning [5.434698132994918]
We propose MetaTPTrans, a meta learning approach for multilingual code representation learning.
We show that MetaTPTrans improves the F1 score of state-of-the-art approaches significantly.
arXiv Detail & Related papers (2022-06-13T20:36:42Z)
- Exploring Teacher-Student Learning Approach for Multi-lingual Speech-to-Intent Classification [73.5497360800395]
We develop an end-to-end system that supports multiple languages.
We exploit knowledge from a pre-trained multi-lingual natural language processing model.
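Such teacher-student transfer is commonly trained with a knowledge-distillation objective in which the student's output distribution is pulled toward the teacher's. A hedged sketch of one such loss follows; the temperature, weighting, and names are illustrative rather than the paper's exact recipe.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    # Soft targets from the teacher, softened by the temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    # Standard cross-entropy on the gold labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```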
arXiv Detail & Related papers (2021-09-28T04:43:11Z)
- Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z)
- InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training [135.12061144759517]
We present an information-theoretic framework that formulates cross-lingual language model pre-training.
We propose a new pre-training task based on contrastive learning.
By leveraging both monolingual and parallel corpora, we jointly train the pretext to improve the cross-lingual transferability of pre-trained models.
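Contrastive pre-training over parallel corpora is typically an InfoNCE-style objective in which translation pairs are positives and other in-batch sentences act as negatives. The sketch below shows that generic formulation, not InfoXLM's exact task; the names and temperature are assumptions.

```python
import torch
import torch.nn.functional as F


def contrastive_loss(src_repr, tgt_repr, temperature=0.05):
    """src_repr, tgt_repr: (batch, d) sentence encodings of parallel pairs."""
    src = F.normalize(src_repr, dim=-1)
    tgt = F.normalize(tgt_repr, dim=-1)
    logits = src @ tgt.t() / temperature          # (batch, batch) similarity matrix
    targets = torch.arange(src.size(0), device=src.device)
    # Each source sentence should match its own translation (the diagonal).
    return F.cross_entropy(logits, targets)
```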
arXiv Detail & Related papers (2020-07-15T16:58:01Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
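Singular vector CCA first reduces each view of the language representations with an SVD and then measures canonical correlations between the reduced views. A rough sketch of that procedure with NumPy and scikit-learn follows; the dimensions and component counts are illustrative, and this is not the paper's implementation.

```python
import numpy as np
from sklearn.cross_decomposition import CCA


def svcca_score(X, Y, keep=20, n_components=10):
    """X, Y: (n_samples, dim) representation matrices from two sources."""
    # Step 1: centre each view and keep its top singular directions.
    def reduce(M, k):
        U, S, _ = np.linalg.svd(M - M.mean(axis=0), full_matrices=False)
        return U[:, :k] * S[:k]

    Xr, Yr = reduce(X, keep), reduce(Y, keep)
    # Step 2: canonical correlation analysis between the reduced views.
    Xc, Yc = CCA(n_components=n_components).fit_transform(Xr, Yr)
    corrs = [np.corrcoef(Xc[:, i], Yc[:, i])[0, 1] for i in range(n_components)]
    # The mean canonical correlation summarises how similar the two views are.
    return float(np.mean(corrs))
```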
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
- Meta-Transfer Learning for Code-Switched Speech Recognition [72.84247387728999]
We propose a new learning method, meta-transfer learning, to transfer learn on a code-switched speech recognition system in a low-resource setting.
Our model learns to recognize individual languages and transfers that knowledge to better recognize mixed-language speech by conditioning the optimization on the code-switching data.
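Meta-transfer learning can be sketched as a MAML-style loop: inner gradient steps on sampled monolingual batches, with the outer update evaluated on code-switched data so that the adaptation is conditioned on the target domain. The first-order sketch below is illustrative only; the function and batch names are assumptions, not the authors' implementation.

```python
import copy
import torch


def meta_transfer_step(model, monolingual_batches, code_switch_batch,
                       loss_fn, inner_lr=1e-3, outer_lr=1e-4):
    """One first-order meta-transfer update (illustrative only)."""
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]
    for batch in monolingual_batches:          # e.g. English-only, Mandarin-only tasks
        fast = copy.deepcopy(model)            # temporary task-specific copy
        # Inner step: adapt the copy on one monolingual batch.
        loss = loss_fn(fast(batch["inputs"]), batch["targets"])
        grads = torch.autograd.grad(loss, fast.parameters())
        with torch.no_grad():
            for p, g in zip(fast.parameters(), grads):
                p -= inner_lr * g
        # Outer step: evaluate the adapted copy on code-switched data.
        cs_loss = loss_fn(fast(code_switch_batch["inputs"]),
                          code_switch_batch["targets"])
        grads = torch.autograd.grad(cs_loss, fast.parameters())
        for acc, g in zip(meta_grads, grads):
            acc += g.detach()
    # Apply the accumulated (first-order) meta-gradient to the real model.
    with torch.no_grad():
        for p, g in zip(model.parameters(), meta_grads):
            p -= outer_lr * g / len(monolingual_batches)
    return model
```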
arXiv Detail & Related papers (2020-04-29T14:27:19Z)
- Zero-Shot Cross-Lingual Transfer with Meta Learning [45.29398184889296]
We consider the setting of training models on multiple languages at the same time, when little or no data is available for languages other than English.
We show that this challenging setup can be approached using meta-learning.
We experiment using standard supervised, zero-shot cross-lingual, as well as few-shot cross-lingual settings for different natural language understanding tasks.
arXiv Detail & Related papers (2020-03-05T16:07:32Z)