Boosting Zero-Shot Crosslingual Performance using LLM-Based Augmentations with Effective Data Selection
- URL: http://arxiv.org/abs/2407.10582v1
- Date: Mon, 15 Jul 2024 10:00:22 GMT
- Title: Boosting Zero-Shot Crosslingual Performance using LLM-Based Augmentations with Effective Data Selection
- Authors: Barah Fazili, Ashish Sunil Agrawal, Preethi Jyothi
- Abstract summary: Large language models (LLMs) are very proficient text generators.
We leverage this capability to generate task-specific data via zero-shot prompting.
We observe significant performance gains across sentiment analysis and natural language inference tasks.
- Score: 23.575482348558904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) are very proficient text generators. We leverage this capability of LLMs to generate task-specific data via zero-shot prompting and promote cross-lingual transfer for low-resource target languages. Given task-specific data in a source language and a teacher model trained on this data, we propose using this teacher to label LLM generations and employ a set of simple data selection strategies that use the teacher's label probabilities. Our data selection strategies help us identify a representative subset of diverse generations that boost zero-shot accuracies while being efficient, in comparison to using all the LLM generations (without any subset selection). We also highlight other important design choices that affect cross-lingual performance, such as the use of translations of source data and which labels are best to use for the LLM generations. We observe significant performance gains across sentiment analysis and natural language inference tasks (up to 7.13 absolute points, and 1.5 absolute points on average) across a number of target languages (Hindi, Marathi, Urdu, Swahili) and domains.
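As a concrete illustration of the teacher-guided selection step described in the abstract, the sketch below scores LLM generations with a teacher classifier and keeps the most confidently labelled examples, balanced per class. The paper evaluates several selection strategies; this is only one plausible instantiation, and the names used here (`Generation`, `teacher_proba`, `per_label_budget`) are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch of teacher-guided data selection, assuming a teacher
# classifier trained on source-language task data that returns per-class
# probabilities for target-language LLM generations. All names below are
# illustrative, not from the paper.
from dataclasses import dataclass
from typing import Callable, List, Sequence

import numpy as np


@dataclass
class Generation:
    text: str          # LLM-generated target-language example
    prompt_label: str  # label the zero-shot prompt asked the LLM to produce


def select_by_teacher_confidence(
    generations: Sequence[Generation],
    teacher_proba: Callable[[List[str]], np.ndarray],  # -> (N, num_classes)
    label_names: List[str],
    per_label_budget: int = 500,
) -> List[dict]:
    """Keep the generations the teacher labels most confidently, balanced per class."""
    probs = teacher_proba([g.text for g in generations])
    teacher_labels = probs.argmax(axis=1)
    confidences = probs.max(axis=1)

    selected = []
    for class_idx, name in enumerate(label_names):
        # Generations the teacher assigns to this class, most confident first.
        idx = np.where(teacher_labels == class_idx)[0]
        idx = idx[np.argsort(-confidences[idx])][:per_label_budget]
        selected.extend(
            {
                "text": generations[i].text,
                "teacher_label": name,
                "prompt_label": generations[i].prompt_label,
                "confidence": float(confidences[i]),
            }
            for i in idx
        )
    return selected
```

In a setup like this, the selected subset would be combined with the (possibly translated) source-language data to fine-tune the target-language student; whether to train on the teacher's labels or the labels requested in the zero-shot prompt is one of the design choices the paper examines.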
Related papers
- Think Carefully and Check Again! Meta-Generation Unlocking LLMs for Low-Resource Cross-Lingual Summarization [108.6908427615402]
Cross-lingual summarization (CLS) aims to generate a summary for the source text in a different target language.
Currently, instruction-tuned large language models (LLMs) excel at various English tasks.
Recent studies have shown that LLMs' performance on CLS tasks remains unsatisfactory even in few-shot settings.
arXiv Detail & Related papers (2024-10-26T00:39:44Z) - UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset [69.33424532827608]
Open-source large language models (LLMs) have gained significant strength across diverse fields.
In this work, we construct an open-source multilingual supervised fine-tuning dataset.
The resulting UltraLink dataset comprises approximately 1 million samples across five languages.
arXiv Detail & Related papers (2024-02-07T05:05:53Z) - Language and Task Arithmetic with Parameter-Efficient Layers for Zero-Shot Summarization [126.96113831681338]
In this paper, we propose to improve zero-shot cross-lingual transfer by composing language or task specialized parameters.
Our method composes language and task PEFT modules via element-wise arithmetic operations to leverage unlabeled data and English labeled data.
arXiv Detail & Related papers (2023-11-15T20:04:58Z) - Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer [92.80671770992572]
Cross-lingual transfer is a central task in multilingual NLP.
Earlier efforts on this task use parallel corpora, bilingual dictionaries, or other annotated alignment data.
We propose a simple yet effective method, SALT, to improve the zero-shot cross-lingual transfer.
arXiv Detail & Related papers (2023-09-19T19:30:56Z) - Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z) - XeroAlign: Zero-Shot Cross-lingual Transformer Alignment [9.340611077939828]
We introduce a method for task-specific alignment of cross-lingual pretrained transformers such as XLM-R.
XeroAlign uses translated task data to encourage the model to generate similar sentence embeddings for different languages.
XLM-RA's text classification accuracy exceeds that of XLM-R trained with labelled data, and XLM-RA performs on par with state-of-the-art models on a cross-lingual adversarial paraphrasing task.
arXiv Detail & Related papers (2021-05-06T07:10:00Z) - UXLA: A Robust Unsupervised Data Augmentation Framework for Zero-Resource Cross-Lingual NLP [19.65783178853385]
We propose UXLA, a novel unsupervised data augmentation framework for zero-resource transfer learning scenarios.
In particular, UXLA aims to solve cross-lingual adaptation problems from a source language task distribution to an unknown target language task distribution.
At its core, UXLA performs simultaneous self-training with data augmentation and unsupervised sample selection.
arXiv Detail & Related papers (2020-04-28T01:47:37Z) - XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation [100.09099800591822]
XGLUE is a new benchmark dataset that can be used to train large-scale cross-lingual pre-trained models.
XGLUE provides 11 diversified tasks that cover both natural language understanding and generation scenarios.
arXiv Detail & Related papers (2020-04-03T07:03:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.