Cross-Lingual Abstractive Summarization with Limited Parallel Resources
- URL: http://arxiv.org/abs/2105.13648v2
- Date: Mon, 31 May 2021 03:26:58 GMT
- Title: Cross-Lingual Abstractive Summarization with Limited Parallel Resources
- Authors: Yu Bai, Yang Gao, Heyan Huang
- Abstract summary: We propose a novel Multi-Task framework for Cross-Lingual Abstractive Summarization (MCLAS) in a low-resource setting.
Employing one unified decoder to generate the sequential concatenation of monolingual and cross-lingual summaries, MCLAS makes the monolingual summarization task a prerequisite of the cross-lingual summarization task.
Our model significantly outperforms three baseline models in both low-resource and full-dataset scenarios.
- Score: 22.680714603332355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parallel cross-lingual summarization data is scarce, requiring models to
better use the limited available cross-lingual resources. Existing methods to
do so often adopt sequence-to-sequence networks with multi-task frameworks.
Such approaches apply multiple decoders, each of which is utilized for a
specific task. However, these independent decoders share no parameters and
hence fail to capture the relationships between the discrete phrases of
summaries in different languages, severing the connections needed to transfer
knowledge from high-resource languages to low-resource languages. To bridge
these connections, we propose a novel Multi-Task framework for Cross-Lingual
Abstractive Summarization (MCLAS) in a low-resource setting. Employing one
unified decoder to generate the sequential concatenation of monolingual and
cross-lingual summaries, MCLAS makes the monolingual summarization task a
prerequisite of the cross-lingual summarization (CLS) task. In this way, the
shared decoder learns interactions involving alignments and summary patterns
across languages, which encourages knowledge transfer. Experiments on
two CLS datasets demonstrate that our model significantly outperforms three
baseline models in both low-resource and full-dataset scenarios. Moreover,
in-depth analysis on the generated summaries and attention heads verifies that
interactions are learned well using MCLAS, which benefits the CLS task under
limited parallel resources.
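To make the unified-decoder idea concrete, here is a minimal sketch of how a concatenated decoder target could be assembled for MCLAS-style training. This is not the authors' code: the special tokens [BOS], [LSEP], [EOS] and the helper name are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of building a unified
# decoder target for MCLAS-style training: the monolingual summary comes
# first, then a language separator, then the cross-lingual summary.
from typing import List

BOS, LSEP, EOS = "[BOS]", "[LSEP]", "[EOS]"  # assumed special tokens

def build_unified_target(mono_summary: List[str],
                         cross_summary: List[str]) -> List[str]:
    """Return one target sequence: [BOS] monolingual summary [LSEP]
    cross-lingual summary [EOS], so a single decoder must emit the
    monolingual summary before the cross-lingual one."""
    return [BOS] + mono_summary + [LSEP] + cross_summary + [EOS]

if __name__ == "__main__":
    mono = "a cat sat on the mat".split()               # source-language summary
    cross = "una gata se sentó en la alfombra".split()  # target-language summary
    print(build_unified_target(mono, cross))
```

Because the cross-lingual summary is generated only after the monolingual one within a single sequence, the shared decoder's self-attention can attend from target-language tokens back to source-language summary tokens, which is the kind of interaction the paper's analysis examines.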
Related papers
- LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models [89.13128402847943]
We present LUSIFER, a novel zero-shot approach that adapts LLM-based embedding models for multilingual tasks without requiring multilingual supervision.
LUSIFER's architecture combines a multilingual encoder, serving as a language-universal learner, with an LLM-based embedding model optimized for embedding-specific tasks.
We introduce a new benchmark encompassing 5 primary embedding tasks, 123 diverse datasets, and coverage across 14 languages.
arXiv Detail & Related papers (2025-01-01T15:43:07Z)
- Think Carefully and Check Again! Meta-Generation Unlocking LLMs for Low-Resource Cross-Lingual Summarization [108.6908427615402]
Cross-lingual summarization (CLS) aims to generate a summary of the source text in a different target language.
Currently, instruction-tuned large language models (LLMs) excel at various English tasks.
Recent studies have shown that LLMs' performance on CLS tasks remains unsatisfactory even in few-shot settings.
arXiv Detail & Related papers (2024-10-26T00:39:44Z)
- Unlocking the Potential of Model Merging for Low-Resource Languages [66.7716891808697]
Adapting large language models to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT).
We propose model merging as an alternative for low-resource languages, combining models with distinct capabilities into a single model without additional training.
Experiments based on Llama-2-7B demonstrate that model merging effectively endows LLMs for low-resource languages with task-solving abilities, outperforming CT-then-SFT in scenarios with extremely scarce data.
arXiv Detail & Related papers (2024-07-04T15:14:17Z)
- TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale [66.01943465390548]
We introduce TriSum, a framework for distilling large language models' text summarization abilities into a compact, local model.
Our method enhances local model performance on various benchmarks.
It also improves interpretability by providing insights into the summarization rationale.
arXiv Detail & Related papers (2024-03-15T14:36:38Z)
- A Variational Hierarchical Model for Neural Cross-Lingual Summarization [85.44969140204026]
Cross-lingual summarization (CLS) converts a document in one language into a summary in another language.
Existing studies on CLS mainly focus on utilizing pipeline methods or jointly training an end-to-end model.
We propose a hierarchical model for the CLS task, based on the conditional variational auto-encoder.
arXiv Detail & Related papers (2022-03-08T02:46:11Z)
- Improving Low-resource Reading Comprehension via Cross-lingual Transposition Rethinking [0.9236074230806579]
Extractive Reading Comprehension (ERC) has made tremendous advances, enabled by the availability of large-scale, high-quality ERC training data.
Despite such rapid progress and widespread application, datasets in languages other than high-resource languages such as English remain scarce.
We propose a Cross-Lingual Transposition ReThinking (XLTT) model by modelling existing high-quality extractive reading comprehension datasets in a multilingual environment.
arXiv Detail & Related papers (2021-07-11T09:35:16Z)
- CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP [68.2650714613869]
We propose a data augmentation framework that generates multilingual code-switching data to fine-tune mBERT (see the sketch after this list).
Compared with existing work, our method does not rely on bilingual sentences for training and requires only one training process for multiple target languages.
arXiv Detail & Related papers (2020-06-11T13:15:59Z)
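As a rough illustration of the code-switching augmentation idea in the CoSDA-ML entry above, the sketch below randomly swaps tokens of a sentence with translations drawn from a bilingual dictionary. It is a simplified assumption of how such data could be generated, not CoSDA-ML's actual implementation; the toy dictionary, token-level replacement, and the `ratio` parameter are all hypothetical.

```python
# Simplified sketch of multilingual code-switching data augmentation
# (not the CoSDA-ML implementation): randomly swap tokens of an English
# sentence with translations from a toy bilingual dictionary.
import random

# Hypothetical toy dictionary: English token -> translations in other languages.
TOY_DICT = {
    "cat": ["gato", "chat"],
    "sat": ["saß"],
    "mat": ["alfombra", "tapis"],
}

def code_switch(tokens, dictionary=TOY_DICT, ratio=0.3, seed=0):
    """Replace roughly `ratio` of the tokens that appear in the dictionary
    with a randomly chosen translation, yielding a code-switched sentence."""
    rng = random.Random(seed)
    switched = []
    for tok in tokens:
        if tok in dictionary and rng.random() < ratio:
            switched.append(rng.choice(dictionary[tok]))
        else:
            switched.append(tok)
    return switched

if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    print(code_switch(sentence, ratio=1.0))  # e.g. ['the', 'gato', 'saß', 'on', 'the', 'tapis']
```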
This list is automatically generated from the titles and abstracts of the papers on this site.