A Survey on Cross-Lingual Summarization
- URL: http://arxiv.org/abs/2203.12515v1
- Date: Wed, 23 Mar 2022 16:24:21 GMT
- Title: A Survey on Cross-Lingual Summarization
- Authors: Jiaan Wang, Fandong Meng, Duo Zheng, Yunlong Liang, Zhixu Li, Jianfeng Qu, Jie Zhou
- Abstract summary: Cross-lingual summarization is the task of generating a summary in one language for a document in a different language.
Driven by globalization, this task has attracted increasing attention from the computational linguistics community.
We present the first systematic critical review of the datasets, approaches and challenges in this field.
- Score: 43.89340385650822
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-lingual summarization is the task of generating a summary in one
language (e.g., English) for the given document(s) in a different language
(e.g., Chinese). Driven by globalization, this task has attracted
increasing attention from the computational linguistics community. Nevertheless,
a comprehensive review of this task is still lacking. Therefore, we
present the first systematic critical review of the datasets, approaches and
challenges in this field. Specifically, we organize existing datasets
and approaches according to their construction methods and solution
paradigms, respectively. For each type of dataset or approach, we
introduce and summarize previous efforts and compare them with each
other to provide deeper analysis. Finally, we discuss promising
directions and offer our thoughts to facilitate future research. This survey is
for both beginners and experts in cross-lingual summarization, and we hope it
will serve as a starting point as well as a source of new ideas for researchers
and engineers interested in this area.
Related papers
- Multi-Target Cross-Lingual Summarization: a novel task and a language-neutral approach [3.5190489716607436]
Cross-lingual summarization aims to bridge language barriers by summarizing documents in different languages.
We introduce multi-target cross-lingual summarization as the task of summarizing a document into multiple target languages while ensuring that the produced summaries are semantically similar.
arXiv Detail & Related papers (2024-10-01T08:33:57Z)
- Understanding Cross-Lingual Alignment -- A Survey [52.572071017877704]
Cross-lingual alignment is the meaningful similarity of representations across languages in multilingual language models.
We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field.
arXiv Detail & Related papers (2024-04-09T11:39:53Z)
- A Study on Scaling Up Multilingual News Framing Analysis [23.80807884935475]
This study explores the possibility of dataset creation through crowdsourcing.
We first extend framing analysis beyond English news to a multilingual context.
We also present a novel benchmark in Bengali and Portuguese on the immigration and same-sex marriage domains.
arXiv Detail & Related papers (2024-04-01T21:02:18Z)
- Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey [17.19337964440007]
There is currently a lack of comprehensive review that summarizes and compares the key techniques, metrics, datasets, models, and optimization approaches in this research domain.
This survey aims to address this gap by consolidating recent progress in these areas, offering a thorough survey and taxonomy of the datasets, metrics, and methodologies utilized.
It identifies strengths, limitations, unexplored territories, and gaps in the existing literature, while providing some insights for future research directions in this vital and rapidly evolving field.
arXiv Detail & Related papers (2024-02-27T23:59:01Z)
- Cross-lingual Offensive Language Detection: A Systematic Review of Datasets, Transfer Approaches and Challenges [10.079109184645478]
This survey presents a systematic and comprehensive exploration of Cross-Lingual Transfer Learning techniques in offensive language detection in social media.
Our study stands as the first holistic overview to focus exclusively on the cross-lingual scenario in this domain.
arXiv Detail & Related papers (2024-01-17T14:44:27Z)
- Federated Learning for Generalization, Robustness, Fairness: A Survey and Benchmark [55.898771405172155]
Federated learning has emerged as a promising paradigm for privacy-preserving collaboration among different parties.
We provide a systematic overview of the important and recent developments of research on federated learning.
arXiv Detail & Related papers (2023-11-12T06:32:30Z)
- Recent Advances in Direct Speech-to-text Translation [58.692782919570845]
We categorize the existing research work into three directions based on the main challenges -- modeling burden, data scarcity, and application issues.
For the challenge of data scarcity, recent work resorts to many sophisticated techniques, such as data augmentation, pre-training, knowledge distillation, and multilingual modeling.
We analyze and summarize the application issues, which include real-time, segmentation, named entity, gender bias, and code-switching.
arXiv Detail & Related papers (2023-06-20T16:14:27Z)
- Multilingual Multimodality: A Taxonomical Survey of Datasets, Techniques, Challenges and Opportunities [10.721189858694396]
We study the unification of multilingual and multimodal (MultiX) streams.
We review the languages studied, gold or silver data with parallel annotations, and understand how these modalities and languages interact in modeling.
We present an account of the modeling approaches along with their strengths and weaknesses to better understand the scenarios in which they can be used reliably.
arXiv Detail & Related papers (2022-10-30T21:46:01Z)
- Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches relaying task impacts across various generation tasks such as storytelling, summarization, and translation.
We present an abstraction of the imperative techniques with respect to learning paradigms, pretraining, modeling approaches, decoding and the key challenges outstanding in the field in each of them.
arXiv Detail & Related papers (2020-10-14T17:54:42Z)
- A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z)