GLUECoS : An Evaluation Benchmark for Code-Switched NLP
- URL: http://arxiv.org/abs/2004.12376v2
- Date: Thu, 14 May 2020 05:57:38 GMT
- Title: GLUECoS : An Evaluation Benchmark for Code-Switched NLP
- Authors: Simran Khanuja, Sandipan Dandapat, Anirudh Srinivasan, Sunayana
Sitaram, Monojit Choudhury
- Abstract summary: We present an evaluation benchmark, GLUECoS, for code-switched languages.
We present results on several NLP tasks in English-Hindi and English-Spanish.
We fine-tune multilingual models on artificially generated code-switched data.
- Score: 17.066725832825423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Code-switching is the use of more than one language in the same conversation
or utterance. Recently, multilingual contextual embedding models, trained on
multiple monolingual corpora, have shown promising results on cross-lingual and
multilingual tasks. We present an evaluation benchmark, GLUECoS, for
code-switched languages, that spans several NLP tasks in English-Hindi and
English-Spanish. Specifically, our evaluation benchmark includes Language
Identification from text, POS tagging, Named Entity Recognition, Sentiment
Analysis, Question Answering and a new task for code-switching, Natural
Language Inference. We present results on all these tasks using cross-lingual
word embedding models and multilingual models. In addition, we fine-tune
multilingual models on artificially generated code-switched data. Although
multilingual models perform significantly better than cross-lingual models, our
results show that in most tasks, across both language pairs, multilingual
models fine-tuned on code-switched data perform best, showing that multilingual
models can be further optimized for code-switching tasks.
Related papers
- Multi-lingual Evaluation of Code Generation Models [82.7357812992118]
We present new benchmarks for evaluating code generation models: MBXP, Multilingual HumanEval, and MathQA-X.
These datasets cover over 10 programming languages.
We are able to assess the performance of code generation models in a multi-lingual fashion.
arXiv Detail & Related papers (2022-10-26T17:17:06Z)
- Call Larisa Ivanovna: Code-Switching Fools Multilingual NLU Models [1.827510863075184]
Novel benchmarks for multilingual natural language understanding (NLU) include monolingual sentences in several languages, annotated with intents and slots.
Existing benchmarks lack code-switched utterances, which are difficult to gather and label because of their complex grammatical structure.
Our work adopts recognized methods to generate plausible, natural-sounding code-switched utterances and uses them to create a synthetic code-switched test set.
arXiv Detail & Related papers (2021-09-29T11:15:00Z)
- xGQA: Cross-Lingual Visual Question Answering [100.35229218735938]
xGQA is a new multilingual evaluation benchmark for the visual question answering task.
We extend the established English GQA dataset to 7 typologically diverse languages.
We propose new adapter-based approaches to adapt multimodal transformer-based models to become multilingual.
arXiv Detail & Related papers (2021-09-13T15:58:21Z)
- Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z)
- CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP [68.2650714613869]
We propose a data augmentation framework to generate multi-lingual code-switching data to fine-tune mBERT.
Compared with the existing work, our method does not rely on bilingual sentences for training, and requires only one training process for multiple target languages.
arXiv Detail & Related papers (2020-06-11T13:15:59Z)
- Learning to Scale Multilingual Representations for Vision-Language Tasks [51.27839182889422]
The effectiveness of SMALR is demonstrated with ten diverse languages, over twice the number supported in vision-language tasks to date.
We evaluate on multilingual image-sentence retrieval and outperform prior work by 3-4% with less than 1/5th the training parameters compared to other word embedding methods.
arXiv Detail & Related papers (2020-04-09T01:03:44Z)
- XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization [128.37244072182506]
The Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) benchmark evaluates the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
We demonstrate that while models tested on English reach human performance on many tasks, there is still a sizable gap in the performance of cross-lingually transferred models.
arXiv Detail & Related papers (2020-03-24T19:09:37Z)