A Variational Hierarchical Model for Neural Cross-Lingual Summarization
- URL: http://arxiv.org/abs/2203.03820v1
- Date: Tue, 8 Mar 2022 02:46:11 GMT
- Title: A Variational Hierarchical Model for Neural Cross-Lingual Summarization
- Authors: Yunlong Liang, Fandong Meng, Chulun Zhou, Jinan Xu, Yufeng Chen,
Jinsong Su and Jie Zhou
- Abstract summary: Cross-lingual summarization (CLS) converts a document in one language into a summary in another language.
Existing studies on CLS mainly focus on utilizing pipeline methods or jointly training an end-to-end model.
We propose a hierarchical model for the CLS task, based on the conditional variational auto-encoder.
- Score: 85.44969140204026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The goal of cross-lingual summarization (CLS) is to convert a document in
one language (e.g., English) to a summary in another one (e.g., Chinese).
Essentially, the CLS task is the combination of machine translation (MT) and
monolingual summarization (MS), and thus there exists a hierarchical
relationship between MT & MS and CLS. Existing studies on CLS mainly focus on
utilizing pipeline methods or jointly training an end-to-end model through an
auxiliary MT or MS objective. However, it is very challenging for the model to
directly conduct CLS as it requires both the abilities to translate and
summarize. To address this issue, we propose a hierarchical model for the CLS
task, based on the conditional variational auto-encoder. The hierarchical model
contains two kinds of latent variables at the local and global levels,
respectively. At the local level, there are two latent variables, one for
translation and the other for summarization. As for the global level, there is
another latent variable for cross-lingual summarization conditioned on the two
local-level variables. Experiments on two language directions (English-Chinese)
verify the effectiveness and superiority of the proposed approach. In addition,
we show that our model is able to generate better cross-lingual summaries than
comparison models in the few-shot setting.
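The local/global latent hierarchy described above can be made concrete with a short sketch. The following is a minimal PyTorch-style illustration, assuming Gaussian latent variables, a pooled document representation, and illustrative dimensions; the class and variable names (LatentHead, HierarchicalLatents, z_mt, z_ms, z_cls) are our own shorthand rather than the paper's, and the decoder, priors, and KL terms of the full conditional variational auto-encoder are omitted.

```python
# Minimal sketch of the local/global latent structure described in the abstract.
# Assumptions (not from the paper): hidden sizes, Gaussian latents, and how the
# latent variables would be fed to a decoder are illustrative choices.
import torch
import torch.nn as nn

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps with the reparameterization trick."""
    std = torch.exp(0.5 * logvar)
    return mu + std * torch.randn_like(std)

class LatentHead(nn.Module):
    """Maps a context vector to the mean/log-variance of a Gaussian latent."""
    def __init__(self, in_dim, z_dim):
        super().__init__()
        self.mu = nn.Linear(in_dim, z_dim)
        self.logvar = nn.Linear(in_dim, z_dim)

    def forward(self, h):
        mu, logvar = self.mu(h), self.logvar(h)
        return reparameterize(mu, logvar), mu, logvar

class HierarchicalLatents(nn.Module):
    """Two local latents (translation, summarization) and one global CLS latent
    conditioned on both, mirroring the local/global hierarchy in the abstract."""
    def __init__(self, hid=512, z_dim=64):
        super().__init__()
        self.z_mt = LatentHead(hid, z_dim)               # local: translation
        self.z_ms = LatentHead(hid, z_dim)               # local: summarization
        self.z_cls = LatentHead(hid + 2 * z_dim, z_dim)  # global: CLS | z_mt, z_ms

    def forward(self, doc_repr):
        z_mt, mt_mu, mt_logvar = self.z_mt(doc_repr)
        z_ms, ms_mu, ms_logvar = self.z_ms(doc_repr)
        # The global latent sees the document representation and both local latents.
        z_cls, cls_mu, cls_logvar = self.z_cls(
            torch.cat([doc_repr, z_mt, z_ms], dim=-1))
        return z_cls, (mt_mu, mt_logvar, ms_mu, ms_logvar, cls_mu, cls_logvar)

# Usage: pool an encoder's output into doc_repr, then condition the decoder on z_cls.
latents = HierarchicalLatents()
doc_repr = torch.randn(2, 512)   # e.g., mean-pooled encoder states for 2 documents
z_cls, stats = latents(doc_repr)
print(z_cls.shape)               # torch.Size([2, 64])
```

In the paper's setting, the two local latents would be supervised by the MT and MS objectives while the global latent, conditioned on both, guides cross-lingual summary generation; the sketch shows only that conditioning structure.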
Related papers
- Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Model [14.92282077647913]
Continual fine-tuning (CFT) is the process of sequentially fine-tuning an LLM to enable the model to adapt to downstream tasks.
We study a two-phase CFT process in which an English-only end-to-end fine-tuned LLM is sequentially fine-tuned on a multilingual dataset.
We observe that the "similarity" of Phase 2 tasks with Phase 1 determines the LLM's adaptability.
arXiv Detail & Related papers (2024-10-21T13:39:03Z)
- Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation [103.90963418039473]
Bi-ACL is a framework that uses only target-side monolingual data and a bilingual dictionary to improve the performance of the MNMT model.
We show that Bi-ACL is more effective both in long-tail languages and in high-resource languages.
arXiv Detail & Related papers (2023-05-22T07:31:08Z)
- Towards Unifying Multi-Lingual and Cross-Lingual Summarization [43.89340385650822]
We aim to unify multilingual summarization (MLS) and cross-lingual summarization (CLS) into a more general setting, i.e., many-to-many summarization (M2MS).
As the first step towards M2MS, we conduct preliminary studies to show that M2MS can better transfer task knowledge across different languages than MLS and CLS.
We propose Pisces, a pre-trained M2MS model that learns language modeling, cross-lingual ability and summarization ability via three-stage pre-training.
arXiv Detail & Related papers (2023-05-16T06:53:21Z)
- Understanding Translationese in Cross-Lingual Summarization [106.69566000567598]
Cross-lingual summarization (CLS) aims at generating a concise summary in a different target language.
To collect large-scale CLS data, existing datasets typically involve translation in their creation.
In this paper, we first confirm that different approaches of constructing CLS datasets will lead to different degrees of translationese.
arXiv Detail & Related papers (2022-12-14T13:41:49Z)
- A Multi-level Supervised Contrastive Learning Framework for Low-Resource Natural Language Inference [54.678516076366506]
Natural Language Inference (NLI) is an increasingly essential task in natural language understanding.
Here we propose a multi-level supervised contrastive learning framework named MultiSCL for low-resource natural language inference.
arXiv Detail & Related papers (2022-05-31T05:54:18Z)
- Examining Scaling and Transfer of Language Model Architectures for Machine Translation [51.69212730675345]
Language models (LMs) process sequences in a single stack of layers, and encoder-decoder models (EncDec) utilize separate layer stacks for input and output processing.
In machine translation, EncDec has long been the favoured approach, but few studies have investigated the performance of LMs.
arXiv Detail & Related papers (2022-02-01T16:20:15Z)
- Cross-Lingual Abstractive Summarization with Limited Parallel Resources [22.680714603332355]
We propose a novel Multi-Task framework for Cross-Lingual Abstractive Summarization (MCLAS) in a low-resource setting.
Employing one unified decoder to generate the sequential concatenation of monolingual and cross-lingual summaries, MCLAS makes the monolingual summarization task a prerequisite of the cross-lingual summarization task.
Our model significantly outperforms three baseline models in both low-resource and full-dataset scenarios.
arXiv Detail & Related papers (2021-05-28T07:51:42Z)
- Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model achieves improvements of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 points over state-of-the-art results.
arXiv Detail & Related papers (2020-10-18T00:21:53Z)
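As a rough illustration of the mixed-lingual pre-training idea in the last entry, the toy sketch below alternates a translation objective (cross-lingual task) with a masked language modeling objective (monolingual task) on a single shared model. The tiny GRU stand-in, the 50/50 task sampling, and the 15% masking rate are illustrative assumptions, not details taken from that paper.

```python
# Toy sketch of mixed-lingual pre-training: one shared seq2seq model is updated
# alternately with a translation loss and a masked-LM-style reconstruction loss.
# Model size, data, and task mixing ratio are illustrative assumptions.
import random
import torch
import torch.nn as nn

VOCAB, PAD, MASK = 1000, 0, 1

class TinySeq2Seq(nn.Module):
    """Stand-in for the shared encoder-decoder used by all pre-training tasks."""
    def __init__(self, d=128):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, d)
        self.enc = nn.GRU(d, d, batch_first=True)
        self.dec = nn.GRU(d, d, batch_first=True)
        self.out = nn.Linear(d, VOCAB)

    def forward(self, src, tgt_in):
        _, h = self.enc(self.emb(src))           # encode source / corrupted text
        dec_out, _ = self.dec(self.emb(tgt_in), h)
        return self.out(dec_out)                 # (batch, tgt_len, VOCAB)

def mlm_corrupt(tokens, p=0.15):
    """Randomly replace a fraction of tokens with MASK; the clean text is the target."""
    corrupted = tokens.clone()
    mask = torch.rand(tokens.shape) < p
    corrupted[mask] = MASK
    return corrupted

model = TinySeq2Seq()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)

for step in range(4):
    if random.random() < 0.5:                    # cross-lingual task: translation
        src = torch.randint(2, VOCAB, (8, 20))   # source-language tokens (toy data)
        tgt = torch.randint(2, VOCAB, (8, 21))   # target-language tokens (toy data)
    else:                                        # monolingual task: masked LM
        tgt = torch.randint(2, VOCAB, (8, 21))
        src = mlm_corrupt(tgt)                   # encoder reads the corrupted text
    logits = model(src, tgt[:, :-1])             # teacher forcing on shifted target
    loss = loss_fn(logits.reshape(-1, VOCAB), tgt[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
    print(f"step {step}: loss {loss.item():.3f}")
```

A real implementation would use a Transformer encoder-decoder and genuine parallel and monolingual corpora; the point of the sketch is only that both objectives update the same shared parameters.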