Resources and Few-shot Learners for In-context Learning in Slavic Languages
- URL: http://arxiv.org/abs/2304.01922v1
- Date: Tue, 4 Apr 2023 16:16:25 GMT
- Title: Resources and Few-shot Learners for In-context Learning in Slavic Languages
- Authors: Michal Štefánik, Marek Kadlčík, Piotr Gramacki, and Petr Sojka
- Abstract summary: We collect the infrastructure necessary for training and evaluation of in-context learning (ICL) in Slavic languages.
We evaluate a set of the most recent in-context learners and compare their results to the supervised baselines.
We find that ICL models tuned in English are also able to learn some tasks from non-English contexts.
- Score: 0.22940141855172028
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite rapid recent progress in creating accurate and compact
in-context learners, most work focuses on in-context learning (ICL) for tasks
in English. However, the ability to interact with users in languages other than
English offers great potential for broadening the applicability of language
technologies to non-English speakers.
In this work, we collect the infrastructure necessary for training and
evaluation of ICL in a selection of Slavic languages: Czech, Polish, and
Russian. We link a diverse set of datasets and cast these into a unified
instructional format through a set of transformations and newly-crafted
templates written purely in target languages. Using the newly-curated dataset,
we evaluate a set of the most recent in-context learners and compare their
results to supervised baselines. Finally, we train, evaluate, and publish a set
of in-context learning models on the collected resources and compare their
performance to previous work.
We find that ICL models tuned in English are also able to learn some tasks from
non-English contexts, but multilingual instruction fine-tuning consistently
improves ICL ability. We also find that massive multitask training can be
outperformed by single-task training in the target language, uncovering the
potential to specialize in-context learners to the language(s) of their
application.
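To make the described pipeline concrete, the minimal sketch below casts a raw classification example into an instructional prompt using a target-language (Czech) template and builds a few-shot prompt for a multilingual seq2seq in-context learner. The Czech template wording, the example fields, and the bigscience/mt0-small checkpoint are illustrative assumptions, not the authors' released templates or models.

```python
# Minimal sketch (not the authors' released code): casting a raw classification
# example into a unified instructional format via a target-language template,
# then building a few-shot prompt for a multilingual seq2seq in-context learner.
# The Czech template wording, example fields, and checkpoint are assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical Czech template for sentiment classification.
TEMPLATE = "Urči sentiment následující recenze ({labels}): {text}\nOdpověď: {label}"
LABELS = "pozitivní / negativní"

def render(example, with_label=True):
    """Cast one raw example into the instructional format."""
    label = example["label"] if with_label else ""
    return TEMPLATE.format(labels=LABELS, text=example["text"], label=label).rstrip()

def build_few_shot_prompt(demonstrations, query):
    """Concatenate labelled demonstrations followed by the unlabelled query."""
    shots = [render(d) for d in demonstrations] + [render(query, with_label=False)]
    return "\n\n".join(shots)

demos = [
    {"text": "Skvělý film, vřele doporučuji.", "label": "pozitivní"},
    {"text": "Nuda, ztráta času.", "label": "negativní"},
]
query = {"text": "Herci hráli výborně a příběh mě bavil."}
prompt = build_few_shot_prompt(demos, query)

# Any multilingual instruction-tuned seq2seq model can serve as the in-context
# learner here; bigscience/mt0-small is only an example, not the paper's model.
tokenizer = AutoTokenizer.from_pretrained("bigscience/mt0-small")
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-small")
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Evaluating such a learner against supervised baselines then reduces to comparing the decoded predictions with gold labels over a held-out test split.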
Related papers
- Concept-aware Data Construction Improves In-context Learning of Language Models [2.4715271879679395]
We show that concept-aware in-context learning is more effective for a majority of new tasks when compared to traditional instruction tuning.
We propose Concept-aware Training (CoAT), a framework for constructing training scenarios that make it beneficial for the LM to learn to utilize the analogical reasoning concepts from demonstrations.
arXiv Detail & Related papers (2024-03-08T19:07:47Z)
- From Classification to Generation: Insights into Crosslingual Retrieval Augmented ICL [8.065775937617417]
We introduce a novel approach that leverages cross-lingual retrieval-augmented in-context learning (CREA-ICL).
By extracting semantically similar prompts from high-resource languages, we aim to improve the zero-shot performance of multilingual pre-trained language models (MPLMs). (A generic sketch of this retrieval step appears after this list.)
Though our approach yields steady improvements in classification tasks, it faces challenges in generation tasks.
arXiv Detail & Related papers (2023-11-11T15:40:21Z)
- Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data.
We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information.
With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z)
- Pre-Training to Learn in Context [138.0745138788142]
The ability of in-context learning is not fully exploited because language models are not explicitly trained to learn in context.
We propose PICL (Pre-training for In-Context Learning), a framework to enhance the language models' in-context learning ability.
Our experiments show that PICL is more effective and task-generalizable than a range of baselines, outperforming larger language models with nearly 4x parameters.
arXiv Detail & Related papers (2023-05-16T03:38:06Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages [87.5457337866383]
We introduce the Image-Grounded Language Understanding Evaluation benchmark.
IGLUE brings together visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages.
We find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks.
arXiv Detail & Related papers (2022-01-27T18:53:22Z)
- Cross-lingual Transferring of Pre-trained Contextualized Language Models [73.97131976850424]
We propose a novel cross-lingual model transferring framework for PrLMs: TreLM.
To handle the symbol order and sequence length differences between languages, we propose an intermediate "TRILayer" structure.
We show the proposed framework significantly outperforms language models trained from scratch with limited data in both performance and efficiency.
arXiv Detail & Related papers (2021-07-27T06:51:13Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- Zero-Shot Cross-Lingual Transfer with Meta Learning [45.29398184889296]
We consider the setting of training models on multiple languages at the same time, when little or no data is available for languages other than English.
We show that this challenging setup can be approached using meta-learning.
We experiment using standard supervised, zero-shot cross-lingual, as well as few-shot cross-lingual settings for different natural language understanding tasks.
arXiv Detail & Related papers (2020-03-05T16:07:32Z)
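The cross-lingual retrieval-augmented ICL entry above retrieves semantically similar prompts from a high-resource language to help a low-resource query. The sketch below is a generic illustration of that retrieval step under assumed details, not the CREA-ICL implementation; the sentence-transformers checkpoint, the English example pool, and the sentiment-style prompt format are all assumptions.

```python
# Generic illustration (not the CREA-ICL authors' code): retrieve the English
# examples most similar to a low-resource (Polish) query and prepend them as
# in-context demonstrations. Checkpoint, example pool, and prompt format are
# assumptions made for this sketch.
from sentence_transformers import SentenceTransformer, util

retriever = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# Labelled English pool (high-resource side).
english_pool = [
    ("The battery lasts for days, great purchase.", "positive"),
    ("The screen cracked after a week.", "negative"),
    ("Fast delivery and friendly support.", "positive"),
]
pool_embeddings = retriever.encode([t for t, _ in english_pool], convert_to_tensor=True)

def retrieve_demonstrations(query_text, k=2):
    """Return the k English examples most similar to the target-language query."""
    query_embedding = retriever.encode(query_text, convert_to_tensor=True)
    scores = util.cos_sim(query_embedding, pool_embeddings)[0]
    top_indices = scores.topk(k).indices.tolist()
    return [english_pool[i] for i in top_indices]

# Polish query whose prompt is augmented with the retrieved English demonstrations.
query = "Telefon przestał działać po dwóch dniach."
demos = retrieve_demonstrations(query)
prompt = "\n\n".join(f"Review: {t}\nSentiment: {l}" for t, l in demos)
prompt += f"\n\nReview: {query}\nSentiment:"
print(prompt)
```

The resulting prompt can be passed to any multilingual in-context learner; the retrieval step only selects which high-resource demonstrations to include.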