Switch Point biased Self-Training: Re-purposing Pretrained Models for
Code-Switching
- URL: http://arxiv.org/abs/2111.01231v1
- Date: Mon, 1 Nov 2021 19:42:08 GMT
- Title: Switch Point biased Self-Training: Re-purposing Pretrained Models for
Code-Switching
- Authors: Parul Chopra, Sai Krishna Rallabandi, Alan W Black, Khyathi Raghavi
Chandu
- Abstract summary: Code-switching is a ubiquitous phenomenon due to the ease of communication it offers in multilingual communities.
We propose a self-training method to repurpose existing pretrained models using a switch-point bias.
Our approach performs well on both tasks by narrowing the gap between switch-point and overall performance.
- Score: 44.034300203700234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Code-switching (CS), a ubiquitous phenomenon due to the ease of communication
it offers in multilingual communities, still remains an understudied problem in
language processing. The primary reasons behind this are: (1) minimal efforts
in leveraging large pretrained multilingual models, and (2) the lack of
annotated data. The distinguishing cause of the low performance of multilingual
models on CS is the intra-sentence mixing of languages, which leads to switch
points. We first benchmark two sequence labeling tasks -- POS and NER -- on 4
different language pairs with a suite of pretrained models to identify the
problems and select the best performing model, char-BERT, among them
(addressing (1)). We then propose a self-training method to repurpose the
existing pretrained models using a switch-point bias by leveraging unannotated
data (addressing (2)). We finally demonstrate that our approach performs well
on both tasks by closing the gap in switch-point performance while retaining
the overall performance on two distinct language pairs in both tasks. Our code
is available here:
https://github.com/PC09/EMNLP2021-Switch-Point-biased-Self-Training.
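The abstract does not spell out the training loop, but a minimal sketch of one switch-point biased self-training step could look like the following. The interface (per-token logits from any token classifier, per-token language ids), the confidence threshold, and the weighting factor `bias` are illustrative assumptions, not the authors' released code.

```python
# Hypothetical sketch of one switch-point biased self-training step.
import torch
import torch.nn.functional as F


def switch_point_weights(lang_ids: torch.Tensor, bias: float = 2.0) -> torch.Tensor:
    """Weight tokens at switch points (where the language id differs from the
    previous token) more heavily than tokens in monolingual context."""
    weights = torch.ones_like(lang_ids, dtype=torch.float)
    switches = lang_ids[1:] != lang_ids[:-1]      # True where the language changes
    weights[1:][switches] = bias                  # upweight the switch token
    weights[:-1][switches] = bias                 # ...and the token just before it
    return weights


def self_training_loss(logits: torch.Tensor,
                       lang_ids: torch.Tensor,
                       confidence: float = 0.9,
                       bias: float = 2.0) -> torch.Tensor:
    """Pseudo-label unannotated tokens with the model's own predictions,
    keep only confident ones, and apply switch-point biased weighting."""
    probs = logits.softmax(dim=-1)
    conf, pseudo_labels = probs.max(dim=-1)       # (T,), (T,)
    keep = conf >= confidence                     # drop uncertain pseudo-labels
    weights = switch_point_weights(lang_ids, bias) * keep.float()
    per_token = F.cross_entropy(logits, pseudo_labels, reduction="none")
    return (weights * per_token).sum() / weights.sum().clamp(min=1.0)


# Toy usage: random logits for a 6-token Hindi/English sentence, 17 POS tags.
logits = torch.randn(6, 17, requires_grad=True)
lang_ids = torch.tensor([0, 0, 1, 1, 0, 0])       # 0 = hi, 1 = en
loss = self_training_loss(logits, lang_ids)
loss.backward()
```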
Related papers
- VECO 2.0: Cross-lingual Language Model Pre-training with
Multi-granularity Contrastive Learning [56.47303426167584]
We propose a cross-lingual pre-trained model VECO2.0 based on contrastive learning with multi-granularity alignments.
Specifically, the sequence-to-sequence alignment is induced to maximize the similarity of the parallel pairs and minimize the non-parallel pairs.
Token-to-token alignment is integrated to bridge the gap between synonymous tokens, mined via a thesaurus dictionary, and the other unpaired tokens in a bilingual instance.
arXiv Detail & Related papers (2023-04-17T12:23:41Z)
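A minimal sketch of the sequence-to-sequence contrastive alignment described above, assuming an in-batch InfoNCE-style objective over sentence embeddings of parallel pairs; the pooling, temperature, and batch construction are illustrative assumptions, not VECO 2.0's actual implementation.

```python
# Illustrative in-batch contrastive alignment: parallel pairs (same row index)
# are pulled together, all non-parallel combinations in the batch are pushed
# apart. This approximates the alignment described above; not VECO 2.0 code.
import torch
import torch.nn.functional as F


def parallel_contrastive_loss(src_emb: torch.Tensor,
                              tgt_emb: torch.Tensor,
                              temperature: float = 0.05) -> torch.Tensor:
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    sim = src @ tgt.t() / temperature          # (B, B) cosine similarities
    targets = torch.arange(src.size(0))        # the diagonal holds parallel pairs
    # Symmetric InfoNCE: source-to-target and target-to-source directions.
    return 0.5 * (F.cross_entropy(sim, targets) +
                  F.cross_entropy(sim.t(), targets))


# Toy usage: batch of 8 sentence embeddings per language, dimension 768.
loss = parallel_contrastive_loss(torch.randn(8, 768), torch.randn(8, 768))
```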
- Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation [80.16548523140025]
We extend the vanilla pretrain-finetune pipeline with extra code-switching restore task to bridge the gap between the pretrain and finetune stages.
Our approach could narrow the cross-lingual sentence representation distance and improve low-frequency word translation with trivial computational cost.
arXiv Detail & Related papers (2022-04-16T16:08:38Z)
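The summary above describes an extra code-switching restore task. A rough sketch of how such corrupted inputs might be built from a bilingual dictionary is shown below; the toy lexicon, replacement rate, and lack of sentinel tokens are assumptions for illustration, not the paper's actual pipeline.

```python
# Hypothetical construction of a code-switching restore example: replace a
# fraction of source words with dictionary translations, and train the
# sequence-to-sequence model to restore the original sentence.
import random

# Toy English->German lexicon standing in for a real bilingual dictionary.
LEXICON = {"house": "Haus", "water": "Wasser", "friend": "Freund"}


def make_restore_example(sentence: str, rate: float = 0.3, seed: int = 0):
    rng = random.Random(seed)
    tokens = sentence.split()
    corrupted = [
        LEXICON[t] if t in LEXICON and rng.random() < rate else t
        for t in tokens
    ]
    # Model input is the code-switched sentence; the target is the original.
    return " ".join(corrupted), sentence


src, tgt = make_restore_example("my friend lives in a house near the water", rate=1.0)
print(src)  # "my Freund lives in a Haus near the Wasser"
print(tgt)  # "my friend lives in a house near the water"
```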
- Cross-Lingual Text Classification with Multilingual Distillation and Zero-Shot-Aware Training [21.934439663979663]
A multi-branch multilingual language model (MBLM) is built on multilingual pre-trained language models (MPLMs).
The method is based on transferring knowledge from high-performance monolingual models within a teacher-student framework.
Results on two cross-lingual classification tasks show that, with only the task's supervised data used, our method improves both the supervised and zero-shot performance of MPLMs.
arXiv Detail & Related papers (2022-02-28T09:51:32Z)
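A generic teacher-student distillation loss of the kind the summary above refers to could be sketched as follows; the temperature, the mixing weight, and the toy shapes are illustrative assumptions rather than the MBLM paper's exact recipe.

```python
# Generic knowledge-distillation loss: the multilingual student matches the
# softened predictions of a high-performance monolingual teacher while also
# fitting the gold labels. Illustrative only; not the MBLM implementation.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_student, soft_teacher, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce


# Toy usage: batch of 4 examples, 3 classes.
loss = distillation_loss(torch.randn(4, 3), torch.randn(4, 3),
                         torch.tensor([0, 2, 1, 1]))
```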
- Continual Learning in Multilingual NMT via Language-Specific Embeddings [92.91823064720232]
The approach replaces the shared vocabulary with a small language-specific vocabulary and fine-tunes only the new embeddings on the new language's parallel data.
Because the parameters of the original model are not modified, its performance on the initial languages does not degrade.
arXiv Detail & Related papers (2021-10-20T10:38:57Z)
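The summary above amounts to adding a new embedding table and training only it while the rest of the model stays frozen. A minimal PyTorch sketch follows, with the stand-in encoder, vocabulary size, and placeholder loss as assumptions rather than the paper's actual NMT setup.

```python
# Freeze an existing model and train only a small, newly added
# language-specific embedding table for the new language. Sketch only; the
# real model, vocabulary, and data pipeline are placeholders.
import torch
import torch.nn as nn

base_model = nn.TransformerEncoder(              # stand-in for the pretrained model
    nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True), num_layers=2)
for p in base_model.parameters():
    p.requires_grad = False                      # original languages stay untouched

new_lang_embeddings = nn.Embedding(num_embeddings=8000, embedding_dim=512)  # new vocab
optimizer = torch.optim.Adam(new_lang_embeddings.parameters(), lr=1e-4)

# One illustrative step on a fake batch of new-language token ids.
token_ids = torch.randint(0, 8000, (4, 16))
hidden = base_model(new_lang_embeddings(token_ids))
loss = hidden.pow(2).mean()                      # placeholder loss for the sketch
loss.backward()
optimizer.step()                                 # only the new embeddings are updated
```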
- Call Larisa Ivanovna: Code-Switching Fools Multilingual NLU Models [1.827510863075184]
Novel benchmarks for multilingual natural language understanding (NLU) include monolingual sentences in several languages, annotated with intents and slots.
Existing benchmarks lack code-switched utterances, which are difficult to gather and label due to the complexity of their grammatical structure.
Our work adopts recognized methods to generate plausible and naturally-sounding code-switched utterances and uses them to create a synthetic code-switched test set.
arXiv Detail & Related papers (2021-09-29T11:15:00Z)
- Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z)
- WARP: Word-level Adversarial ReProgramming [13.08689221166729]
In many applications it is preferable to tune much smaller sets of parameters, so that the majority of parameters can be shared across multiple tasks.
We present an alternative approach based on adversarial reprogramming, which extends earlier work on automatic prompt generation.
We show that this approach outperforms other methods with a similar number of trainable parameters on SST-2 and MNLI datasets.
arXiv Detail & Related papers (2021-01-01T00:41:03Z)
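The WARP-style idea of tuning only a small set of parameters can be sketched as a handful of trainable prompt embeddings prepended to a frozen model's input; the toy encoder, sizes, and read-out below are placeholder assumptions, not WARP's reported configuration.

```python
# Reprogramming-style sketch: keep the pretrained encoder frozen and learn
# only a few prompt embeddings plus a classification head.
import torch
import torch.nn as nn

d_model, n_prompts, n_classes = 512, 8, 2
encoder = nn.TransformerEncoder(                  # stand-in for the frozen LM
    nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True), num_layers=2)
for p in encoder.parameters():
    p.requires_grad = False                       # the large model is never updated

prompt = nn.Parameter(torch.randn(n_prompts, d_model) * 0.02)   # trainable "words"
classifier = nn.Linear(d_model, n_classes)
optimizer = torch.optim.Adam([prompt, *classifier.parameters()], lr=1e-3)

# One step on a fake batch of already-embedded input tokens.
inputs = torch.randn(4, 16, d_model)
x = torch.cat([prompt.expand(4, -1, -1), inputs], dim=1)         # prepend prompts
logits = classifier(encoder(x)[:, 0])                            # read out first position
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 1, 1, 0]))
loss.backward()
optimizer.step()                                  # only prompt + head are updated
```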
- Code Switching Language Model Using Monolingual Training Data [0.0]
Training a code-switching (CS) language model using only monolingual data is still an ongoing research problem.
In this work, an RNN language model is trained using alternate batches from only monolingual English and Spanish data.
Results were consistently improved using mean square error (MSE) in the output embeddings of the RNN-based language model.
arXiv Detail & Related papers (2020-12-23T08:56:39Z)
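A rough sketch of the alternating-batch training described above, with a toy LSTM language model and random placeholder batches standing in for the English and Spanish corpora; the MSE term on the output embeddings mentioned in the summary is not reproduced here.

```python
# Sketch of training a single RNN language model on alternating monolingual
# batches (one English batch, then one Spanish batch, and so on). The model,
# shared vocabulary, and random data are placeholders for illustration.
import torch
import torch.nn as nn

vocab_size, d_model = 10000, 256
embed = nn.Embedding(vocab_size, d_model)
rnn = nn.LSTM(d_model, d_model, batch_first=True)
head = nn.Linear(d_model, vocab_size)
optimizer = torch.optim.Adam([*embed.parameters(), *rnn.parameters(), *head.parameters()])


def lm_step(batch: torch.Tensor) -> None:
    """One next-token-prediction step on a (B, T) batch of token ids."""
    inputs, targets = batch[:, :-1], batch[:, 1:]
    hidden, _ = rnn(embed(inputs))
    loss = nn.functional.cross_entropy(head(hidden).transpose(1, 2), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()


english_batches = [torch.randint(0, vocab_size, (8, 32)) for _ in range(3)]
spanish_batches = [torch.randint(0, vocab_size, (8, 32)) for _ in range(3)]
for en_batch, es_batch in zip(english_batches, spanish_batches):
    lm_step(en_batch)      # alternate: one English batch...
    lm_step(es_batch)      # ...then one Spanish batch
```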
- Cross-lingual Spoken Language Understanding with Regularized Representation Alignment [71.53159402053392]
We propose a regularization approach to align word-level and sentence-level representations across languages without any external resource.
Experiments on the cross-lingual spoken language understanding task show that our model outperforms current state-of-the-art methods in both few-shot and zero-shot scenarios.
arXiv Detail & Related papers (2020-09-30T08:56:53Z)
- GLUECoS: An Evaluation Benchmark for Code-Switched NLP [17.066725832825423]
We present an evaluation benchmark, GLUECoS, for code-switched languages.
We present results on several NLP tasks in English-Hindi and English-Spanish.
We fine-tune multilingual models on artificially generated code-switched data.
arXiv Detail & Related papers (2020-04-26T13:28:34Z)