LLM-powered Data Augmentation for Enhanced Cross-lingual Performance
- URL: http://arxiv.org/abs/2305.14288v2
- Date: Sun, 22 Oct 2023 22:57:00 GMT
- Title: LLM-powered Data Augmentation for Enhanced Cross-lingual Performance
- Authors: Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Abstract summary: This paper explores the potential of leveraging Large Language Models (LLMs) for data augmentation in commonsense reasoning datasets.
To achieve this, we utilise several LLMs, namely Dolly-v2, StableVicuna, ChatGPT, and GPT-4, to augment three datasets: XCOPA, XWinograd, and XStoryCloze.
We evaluate the effectiveness of fine-tuning smaller multilingual models, mBERT and XLMR, using the synthesised data.
- Score: 24.20730298894794
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper explores the potential of leveraging Large Language Models (LLMs)
for data augmentation in multilingual commonsense reasoning datasets where the
available training data is extremely limited. To achieve this, we utilise
several LLMs, namely Dolly-v2, StableVicuna, ChatGPT, and GPT-4, to augment
three datasets: XCOPA, XWinograd, and XStoryCloze. Subsequently, we evaluate
the effectiveness of fine-tuning smaller multilingual models, mBERT and XLMR,
using the synthesised data. We compare the performance of training with data
generated in English and target languages, as well as translated
English-generated data, revealing the overall advantages of incorporating data
generated by LLMs, e.g. a notable 13.4-point accuracy improvement in the best
case. Furthermore, we conduct a human evaluation by asking native speakers to
assess the naturalness and logical coherence of the generated examples across
different languages. The results of the evaluation indicate that LLMs such as
ChatGPT and GPT-4 excel at producing natural and coherent text in most
languages; however, they struggle to generate meaningful text in certain
languages like Tamil. We also observe that ChatGPT falls short in generating
plausible alternatives compared to the original dataset, whereas examples from
GPT-4 exhibit competitive logical consistency.
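The augmentation recipe described above (few-shot prompting an LLM to synthesise new commonsense examples, then mixing them into the training set) can be sketched as follows. This is a minimal illustration only: the function name, field names, and prompt wording are assumptions, not the paper's actual prompts.

```python
# Hedged sketch of LLM-based data augmentation for an XCOPA-style dataset.
# Field names (premise/choice1/choice2/question) mirror XCOPA's schema;
# the prompt text and build_augmentation_prompt helper are illustrative.

def build_augmentation_prompt(seed_examples, target_language="Swahili", n_new=3):
    """Build a few-shot prompt asking an LLM to generate new causal
    commonsense examples in the target language."""
    lines = [
        f"Generate {n_new} new causal commonsense examples in {target_language}.",
        "Each example has a premise, two candidate choices, and a question",
        "('cause' or 'effect'). Follow the format of these examples:",
        "",
    ]
    for ex in seed_examples:
        lines.append(f"Premise: {ex['premise']}")
        lines.append(f"Choice 1: {ex['choice1']}")
        lines.append(f"Choice 2: {ex['choice2']}")
        lines.append(f"Question: {ex['question']}")
        lines.append("")
    return "\n".join(lines)

seed = [{
    "premise": "The man broke his toe.",
    "choice1": "He got a hole in his sock.",
    "choice2": "He dropped a hammer on his foot.",
    "question": "cause",
}]
prompt = build_augmentation_prompt(seed)
# The prompt would then be sent to an LLM (e.g. GPT-4); the parsed
# responses are appended to the original data before fine-tuning
# a smaller multilingual model such as mBERT or XLM-R.
```

The key design point from the paper is the comparison of three sources for the synthetic data: generation directly in the target language, generation in English, and English generation followed by machine translation.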
Related papers
- Think Carefully and Check Again! Meta-Generation Unlocking LLMs for Low-Resource Cross-Lingual Summarization [108.6908427615402]
Cross-lingual summarization (CLS) aims to generate a summary for the source text in a different target language.
Currently, instruction-tuned large language models (LLMs) excel at various English tasks.
Recent studies have shown that LLMs' performance on CLS tasks remains unsatisfactory even with few-shot settings.
arXiv Detail & Related papers (2024-10-26T00:39:44Z) - sPhinX: Sample Efficient Multilingual Instruction Fine-Tuning Through N-shot Guided Prompting [29.63634707674839]
We introduce a novel recipe for creating a multilingual synthetic instruction tuning dataset, sPhinX.
sPhinX is created by selectively translating instruction response pairs from English into 50 languages.
We test the effectiveness of sPhinx by using it to fine-tune two state-of-the-art models, Mistral-7B and Phi-Small.
arXiv Detail & Related papers (2024-07-13T13:03:45Z) - CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models [36.82189550072201]
Existing text-to-table datasets are typically oriented toward English.
Large language models (LLMs) have shown great success as general task solvers in multi-lingual settings.
We propose a Chinese text-to-table dataset, CT-Eval, to benchmark LLMs on this task.
arXiv Detail & Related papers (2024-05-20T16:58:02Z) - Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages [51.301942056881146]
We investigate how large language models (LLMs) function as rerankers in cross-lingual information retrieval systems for African languages.
Our implementation covers English and four African languages (Hausa, Somali, Swahili, and Yoruba).
We examine cross-lingual reranking with queries in English and passages in the African languages.
arXiv Detail & Related papers (2023-12-26T18:38:54Z) - MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks [12.665447518524187]
This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs by comparing them on the same set of multilingual datasets.
Our benchmark comprises 22 datasets covering 83 languages, including low-resource African languages.
We also perform a study on data contamination and find that several models are likely to be contaminated with multilingual evaluation benchmarks.
arXiv Detail & Related papers (2023-11-13T16:45:37Z) - Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations [59.056367787688146]
This paper pioneers the exploration and training of powerful Multilingual Math Reasoning (xMR) LLMs.
We construct the first multilingual math reasoning instruction dataset, MGSM8KInstruct, encompassing ten distinct languages.
arXiv Detail & Related papers (2023-10-31T08:09:20Z) - Improving Domain-Specific Retrieval by NLI Fine-Tuning [64.79760042717822]
This article investigates the fine-tuning potential of natural language inference (NLI) data to improve information retrieval and ranking.
We employ both monolingual and multilingual sentence encoders fine-tuned by a supervised method utilizing contrastive loss and NLI data.
Our results show that NLI fine-tuning improves performance on both tasks and in both languages, with the potential to improve mono- and multilingual models.
arXiv Detail & Related papers (2023-08-06T12:40:58Z) - Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts [75.33019401706188]
Large language models (LLMs) are known to effectively perform tasks by simply observing few exemplars.
We propose to assemble synthetic exemplars from a diverse set of high-resource languages to prompt the LLMs to translate from any language into English.
Our unsupervised prompting method performs on par with supervised few-shot learning in LLMs of different sizes for translations between English and 13 Indic and 21 African low-resource languages.
arXiv Detail & Related papers (2023-06-20T08:27:47Z) - Improving Polish to English Neural Machine Translation with Transfer Learning: Effects of Data Volume and Language Similarity [2.4674086273775035]
We investigate the impact of data volume and the use of similar languages on transfer learning in a machine translation task.
We fine-tune mBART model for a Polish-English translation task using the OPUS-100 dataset.
Our experiments show that a combination of related languages and larger amounts of data outperforms the model trained on related languages or larger amounts of data alone.
arXiv Detail & Related papers (2023-06-01T13:34:21Z) - mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
Despite promising results, current models still suffer from generating factually inconsistent summaries.
We leverage factual consistency evaluation models to improve multilingual summarization.
arXiv Detail & Related papers (2022-12-20T19:52:41Z) - Improving Low-resource Reading Comprehension via Cross-lingual Transposition Rethinking [0.9236074230806579]
Extractive Reading Comprehension (ERC) has made tremendous advances enabled by the availability of large-scale high-quality ERC training data.
Despite such rapid progress and widespread application, datasets in languages other than high-resource languages such as English remain scarce.
We propose a Cross-Lingual Transposition ReThinking (XLTT) model by modelling existing high-quality extractive reading comprehension datasets in a multilingual environment.
arXiv Detail & Related papers (2021-07-11T09:35:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.