Code Switching Language Model Using Monolingual Training Data
- URL: http://arxiv.org/abs/2012.12543v2
- Date: Thu, 24 Dec 2020 02:13:43 GMT
- Title: Code Switching Language Model Using Monolingual Training Data
- Authors: Asad Ullah, Tauseef Ahmed
- Abstract summary: Training a code-switching (CS) language model using only monolingual data is still an ongoing research problem.
In this work, an RNN language model is trained using alternate batches from only monolingual English and Spanish data.
Results were consistently improved by applying a mean square error (MSE) loss to the output embeddings of the RNN-based language model.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training a code-switching (CS) language model using only monolingual data is
still an ongoing research problem. In this paper, a CS language model is trained using
only monolingual training data. Since recurrent neural network (RNN) models are well
suited to predicting sequential data, an RNN language model is trained in this work using
alternate batches of monolingual English and Spanish data, and the perplexity of the
language model is computed. From the results, it is concluded that using alternate
batches of monolingual data in training reduces the perplexity of a CS language model.
The results were consistently improved by applying a mean square error (MSE) loss to the
output embeddings of the RNN-based language model. By combining both methods, perplexity
is reduced from 299.63 to 80.38. The proposed methods are comparable to a language model
fine-tuned with code-switched training data.
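Below is a minimal PyTorch sketch of the training scheme described in the abstract: an RNN (LSTM) language model fed alternating monolingual English and Spanish batches, with an auxiliary MSE term on the output-embedding matrix and perplexity computed as the exponential of the per-token cross-entropy. The toy data, the shared English+Spanish vocabulary, the MSE weight, and the use of dictionary-aligned word-index pairs for the MSE term are illustrative assumptions; the abstract does not specify the exact MSE formulation or hyperparameters.

```python
# Minimal sketch (PyTorch), not the authors' code. Assumptions not stated in the
# abstract: a shared English+Spanish vocabulary, a simple interleaved EN/ES batch
# schedule, and an MSE term that pulls output embeddings of dictionary-aligned
# EN/ES word pairs toward each other.
import torch
import torch.nn as nn

VOCAB, EMB, HID, MSE_W = 1000, 64, 128, 0.1     # toy sizes; hypothetical MSE weight

class RNNLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.LSTM(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)        # weight rows serve as output embeddings

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.out(h)                      # logits over the shared vocabulary

model = RNNLM()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

# Toy stand-ins for monolingual EN/ES batches and a small bilingual dictionary of
# (en_id, es_id) index pairs -- hypothetical, for illustration only.
en_batches = [torch.randint(0, VOCAB, (32, 20)) for _ in range(5)]
es_batches = [torch.randint(0, VOCAB, (32, 20)) for _ in range(5)]
en_ids, es_ids = torch.tensor([1, 2, 3]), torch.tensor([501, 502, 503])

def lm_loss(batch):
    """Next-word cross-entropy on one monolingual batch of token ids."""
    logits = model(batch[:, :-1])
    return ce(logits.reshape(-1, VOCAB), batch[:, 1:].reshape(-1))

for en_batch, es_batch in zip(en_batches, es_batches):
    for batch in (en_batch, es_batch):          # alternate English / Spanish batches
        W = model.out.weight                    # output-embedding matrix
        mse = ((W[en_ids] - W[es_ids]) ** 2).mean()
        loss = lm_loss(batch) + MSE_W * mse     # LM loss + MSE on output embeddings
        optim.zero_grad()
        loss.backward()
        optim.step()

with torch.no_grad():                           # perplexity = exp(mean token NLL)
    print("perplexity:", torch.exp(lm_loss(en_batches[0])).item())
```

Alternating whole batches (rather than mixing languages within a batch) keeps each update purely monolingual while still exposing the shared parameters to both languages, which is one plausible reading of the setup; the paper's actual implementation may differ.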
Related papers
- Switch Point biased Self-Training: Re-purposing Pretrained Models for Code-Switching [44.034300203700234]
Code-switching is a ubiquitous phenomenon due to the ease of communication it offers in multilingual communities.
We propose a self-training method to repurpose existing pretrained models using a switch-point bias.
Our approach performs well on both tasks by reducing the switch-point performance gap.
arXiv Detail & Related papers (2021-11-01T19:42:08Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- Exploring Monolingual Data for Neural Machine Translation with Knowledge Distillation [10.745228927771915]
We explore two types of monolingual data that can be included in knowledge distillation training for neural machine translation (NMT).
We find that source-side monolingual data improves model performance when evaluated on test sets originating from the source side.
We also show that it is not required to train the student model with the same data used by the teacher, as long as the domains are the same.
arXiv Detail & Related papers (2020-12-31T05:28:42Z)
- A Hybrid Approach for Improved Low Resource Neural Machine Translation using Monolingual Data [0.0]
Many language pairs are low resource, meaning the amount and/or quality of available parallel data is not sufficient to train a neural machine translation (NMT) model.
This work proposes a novel approach that enables both the backward and forward models to benefit from the monolingual target data.
arXiv Detail & Related papers (2020-11-14T22:18:45Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low resource languages is a challenging task because the patterns are hard to predict.
This work compares a neural model and character language models trained with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero amounts of training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- Unsupervised Pretraining for Neural Machine Translation Using Elastic Weight Consolidation [0.0]
This work presents our ongoing research on unsupervised pretraining in neural machine translation (NMT).
In our method, we initialize the weights of the encoder and decoder with two language models that are trained with monolingual data.
We show that initializing the bidirectional NMT encoder with a left-to-right language model and forcing the model to remember the original left-to-right language modeling task limits the learning capacity of the encoder.
arXiv Detail & Related papers (2020-10-19T11:51:45Z)
- Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model achieves an improvement of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 scores over state-of-the-art results.
arXiv Detail & Related papers (2020-10-18T00:21:53Z)
- Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across a diverse setting, including low-, medium-, and rich-resource languages, as well as transfer to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z)
- Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT [129.99918589405675]
We present an effective approach that reuses an LM that is pretrained only on the high-resource language.
The monolingual LM is fine-tuned on both languages and is then used to initialize a UNMT model.
Our approach, RE-LM, outperforms a competitive cross-lingual pretraining model (XLM) on English-Macedonian (En-Mk) and English-Albanian (En-Sq).
arXiv Detail & Related papers (2020-09-16T11:37:10Z)
- Rnn-transducer with language bias for end-to-end Mandarin-English code-switching speech recognition [58.105818353866354]
We propose an improved recurrent neural network transducer (RNN-T) model with language bias to alleviate the problem.
We use the language identities to bias the model to predict the CS points.
This encourages the model to learn language identity information directly from the transcription, so no additional LID model is needed.
arXiv Detail & Related papers (2020-02-19T12:01:33Z)