Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models
- URL: http://arxiv.org/abs/2406.04630v1
- Date: Fri, 7 Jun 2024 04:31:41 GMT
- Title: Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models
- Authors: Gyutae Park, Seojin Hwang, Hwanhee Lee
- Abstract summary: Cross-lingual summarization (XLS) aims to generate a summary in a target language different from the source language document.
While large language models (LLMs) have shown promising zero-shot XLS performance, their few-shot capabilities on this task remain unexplored.
We investigate the few-shot XLS performance of various models, including Mistral-7B-Instruct-v0.2, GPT-3.5, and GPT-4.
- Score: 4.9325206373289125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-lingual summarization (XLS) aims to generate a summary in a target language different from the source language document. While large language models (LLMs) have shown promising zero-shot XLS performance, their few-shot capabilities on this task remain unexplored, especially for low-resource languages with limited parallel data. In this paper, we investigate the few-shot XLS performance of various models, including Mistral-7B-Instruct-v0.2, GPT-3.5, and GPT-4. Our experiments demonstrate that few-shot learning significantly improves the XLS performance of LLMs, particularly GPT-3.5 and GPT-4, in low-resource settings. However, the open-source model Mistral-7B-Instruct-v0.2 struggles to adapt effectively to the XLS task with limited examples. Our findings highlight the potential of few-shot learning for improving XLS performance and the need for further research in designing LLM architectures and pre-training objectives tailored for this task. We provide a future work direction to explore more effective few-shot learning strategies and to investigate the transfer learning capabilities of LLMs for cross-lingual summarization.
Related papers
- Think Carefully and Check Again! Meta-Generation Unlocking LLMs for Low-Resource Cross-Lingual Summarization [108.6908427615402]
Cross-lingual summarization (CLS) aims to generate a summary for the source text in a different target language.
Currently, instruction-tuned large language models (LLMs) excel at various English tasks.
Recent studies have shown that LLMs' performance on CLS tasks remains unsatisfactory even with few-shot settings.
arXiv Detail & Related papers (2024-10-26T00:39:44Z)
- TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale [66.01943465390548]
We introduce TriSum, a framework for distilling large language models' text summarization abilities into a compact, local model.
Our method enhances local model performance on various benchmarks.
It also improves interpretability by providing insights into the summarization rationale.
arXiv Detail & Related papers (2024-03-15T14:36:38Z)
- Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages [51.301942056881146]
We investigate how large language models (LLMs) function as rerankers in cross-lingual information retrieval systems for African languages.
Our implementation covers English and four African languages (Hausa, Somali, Swahili, and Yoruba).
We examine cross-lingual reranking with queries in English and passages in the African languages.
arXiv Detail & Related papers (2023-12-26T18:38:54Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes to out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks [12.665447518524187]
This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs by comparing them on the same set of multilingual datasets.
Our benchmark comprises 22 datasets covering 83 languages, including low-resource African languages.
We also perform a study on data contamination and find that several models are likely to be contaminated with multilingual evaluation benchmarks.
arXiv Detail & Related papers (2023-11-13T16:45:37Z)
- Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts [75.33019401706188]
Large language models (LLMs) are known to effectively perform tasks by simply observing few exemplars.
We propose to assemble synthetic exemplars from a diverse set of high-resource languages to prompt the LLMs to translate from any language into English.
Our unsupervised prompting method performs on par with supervised few-shot learning in LLMs of different sizes for translations between English and 13 Indic and 21 African low-resource languages.
arXiv Detail & Related papers (2023-06-20T08:27:47Z)
- Language Model Self-improvement by Reinforcement Learning Contemplation [13.152789365858812]
This paper introduces a novel unsupervised method called Language Model Self-Improvement by Reinforcement Learning Contemplation (SIRLC).
As a student, the model generates answers to unlabeled questions, while as a teacher, it evaluates the generated text and assigns scores accordingly.
We demonstrate that SIRLC can be applied to various NLP tasks, such as reasoning problems, text generation, and machine translation.
arXiv Detail & Related papers (2023-05-23T19:25:52Z)
- Zero-Shot Cross-Lingual Summarization via Large Language Models [108.30673793281987]
Cross-lingual summarization (CLS) generates a summary in a different target language.
The recent emergence of Large Language Models (LLMs) has attracted wide attention from the computational linguistics community.
In this report, we empirically use various prompts to guide LLMs to perform zero-shot CLS from different paradigms.
arXiv Detail & Related papers (2023-02-28T01:27:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.