Rank Your Summaries: Enhancing Bengali Text Summarization via
Ranking-based Approach
- URL: http://arxiv.org/abs/2307.07392v1
- Date: Fri, 14 Jul 2023 15:07:20 GMT
- Title: Rank Your Summaries: Enhancing Bengali Text Summarization via
Ranking-based Approach
- Authors: G. M. Shahariar, Tonmoy Talukder, Rafin Alam Khan Sotez, Md. Tanvir
Rouf Shawon
- Abstract summary: This paper aims to identify the most accurate and informative summary for a given text by utilizing a simple but effective ranking-based approach.
We utilize four pre-trained summarization models to generate summaries, followed by applying a text ranking algorithm to identify the most suitable summary.
Experimental results suggest that by leveraging the strengths of each pre-trained transformer model, our methodology significantly improves the accuracy and effectiveness of Bengali text summarization.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With the increasing need for text summarization techniques that are both
efficient and accurate, it becomes crucial to explore avenues that enhance the
quality and precision of pre-trained models specifically tailored for
summarizing Bengali texts. When it comes to text summarization tasks, there are
numerous pre-trained transformer models at one's disposal. Consequently, it
becomes quite a challenge to discern the most informative and relevant summary
for a given text among the various options generated by these pre-trained
summarization models. This paper aims to identify the most accurate and
informative summary for a given text by utilizing a simple but effective
ranking-based approach that compares the output of four different pre-trained
Bengali text summarization models. The process begins by preprocessing the
input text to eliminate unnecessary elements such as special characters and
punctuation marks. Next, we utilize four
pre-trained summarization models to generate summaries, followed by applying a
text ranking algorithm to identify the most suitable summary. Ultimately, the
summary with the highest ranking score is chosen as the final one. To evaluate
the effectiveness of this approach, the generated summaries are compared
against human-annotated summaries using standard NLG metrics such as BLEU,
ROUGE, BERTScore, WIL, WER, and METEOR. Experimental results suggest that by
leveraging the strengths of each pre-trained transformer model and combining
them using a ranking-based approach, our methodology significantly improves the
accuracy and effectiveness of Bengali text summarization.
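
To make the pipeline concrete, here is a minimal sketch in Python of the four-step procedure the abstract describes (preprocess, generate candidates with four pre-trained models, rank, select). The checkpoint names below are hypothetical stand-ins, and the mean pairwise-similarity ranker is only an assumed reading of the paper's unspecified "text ranking algorithm", not the authors' exact method.

```python
# Minimal sketch of the ranking-based selection pipeline described in the
# abstract. Model names and the ranking scheme are assumptions, not the
# authors' exact configuration.
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import pipeline

# Hypothetical stand-ins for the four pre-trained Bengali summarizers.
MODEL_NAMES = [
    "csebuetnlp/mT5_multilingual_XLSum",
    "hypothetical/bengali-summarizer-2",
    "hypothetical/bengali-summarizer-3",
    "hypothetical/bengali-summarizer-4",
]


def preprocess(text: str) -> str:
    """Remove special characters and punctuation, collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def generate_candidates(text: str) -> list[str]:
    """Generate one candidate summary per pre-trained model."""
    candidates = []
    for name in MODEL_NAMES:
        summarizer = pipeline("summarization", model=name)
        output = summarizer(text, max_length=128, truncation=True)
        candidates.append(output[0]["summary_text"])
    return candidates


def rank_candidates(candidates: list[str]) -> list[float]:
    """Score each candidate by its mean TF-IDF cosine similarity to the
    other candidates (an assumed stand-in for the paper's ranker)."""
    tfidf = TfidfVectorizer().fit_transform(candidates)
    sim = cosine_similarity(tfidf)
    n = len(candidates)
    return [(sim[i].sum() - sim[i][i]) / (n - 1) for i in range(n)]


def best_summary(text: str) -> str:
    """Run the full pipeline and return the highest-ranked summary."""
    candidates = generate_candidates(preprocess(text))
    scores = rank_candidates(candidates)
    return candidates[scores.index(max(scores))]
```

The abstract's evaluation step compares the selected summary against a human-annotated reference using BLEU, ROUGE, BERTScore, WIL, WER, and METEOR. A hedged sketch of that scoring, assuming Hugging Face `evaluate` and `jiwer` as stand-ins for the authors' actual tooling:

```python
# Hedged sketch of scoring a generated summary against a human-annotated
# reference with the metrics listed in the abstract. `evaluate` and
# `jiwer` are assumed tooling, not necessarily what the authors used.
import evaluate
import jiwer


def score_summary(prediction: str, reference: str) -> dict:
    results = {}
    for name in ("bleu", "rouge", "meteor"):
        metric = evaluate.load(name)
        results[name] = metric.compute(
            predictions=[prediction], references=[reference]
        )
    bertscore = evaluate.load("bertscore")
    results["bertscore"] = bertscore.compute(
        predictions=[prediction], references=[reference], lang="bn"
    )
    # jiwer computes word error rate (WER) and word information lost (WIL).
    results["wer"] = jiwer.wer(reference, prediction)
    results["wil"] = jiwer.wil(reference, prediction)
    return results
```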
Related papers
- Assessment of Transformer-Based Encoder-Decoder Model for Human-Like Summarization [0.05852077003870416]
This work leverages the transformer-based BART model for human-like summarization.
After training and fine-tuning the encoder-decoder model, it is tested on diverse sample articles.
The fine-tuned model's performance is compared with that of the baseline pre-trained model.
Empirical results on BBC News articles highlight that gold-standard summaries written by humans are 17% more factually consistent.
arXiv Detail & Related papers (2024-10-22T09:25:04Z)
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created human-annotated dataset consisting of coherent summaries for five publicly available datasets and natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% Rouge-L) in terms of producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z)
- AugSumm: towards generalizable speech summarization using synthetic labels from large language model [61.73741195292997]
Abstractive speech summarization (SSUM) aims to generate human-like summaries from speech.
Conventional SSUM models are mostly trained and evaluated with a single ground-truth (GT) human-annotated deterministic summary.
We propose AugSumm, a method to leverage large language models (LLMs) as a proxy for human annotators to generate augmented summaries.
arXiv Detail & Related papers (2024-01-10T18:39:46Z)
- Revisiting text decomposition methods for NLI-based factuality scoring of summaries [9.044665059626958]
We show that fine-grained decomposition is not always a winning strategy for factuality scoring.
We also show that small changes to previously proposed entailment-based scoring methods can result in better performance.
arXiv Detail & Related papers (2022-11-30T09:54:37Z)
- COLO: A Contrastive Learning based Re-ranking Framework for One-Stage Summarization [84.70895015194188]
We propose a Contrastive Learning based re-ranking framework for one-stage summarization called COLO.
COLO boosts the extractive and abstractive results of one-stage systems on the CNN/DailyMail benchmark to 44.58 and 46.33 ROUGE-1, respectively.
arXiv Detail & Related papers (2022-09-29T06:11:21Z)
- Comparing Methods for Extractive Summarization of Call Centre Dialogue [77.34726150561087]
We experimentally compare several such methods by using them to produce summaries of calls, and evaluating these summaries objectively.
We found that TopicSum and Lead-N outperform the other summarisation methods, whilst BERTSum received comparatively lower scores in both subjective and objective evaluations.
arXiv Detail & Related papers (2022-09-06T13:16:02Z)
- ARMAN: Pre-training with Semantically Selecting and Reordering of Sentences for Persian Abstractive Summarization [7.16879432974126]
We propose ARMAN, a Transformer-based encoder-decoder model pre-trained with three novel objectives.
In ARMAN, salient sentences from a document are selected according to a modified semantic score to be masked and form a pseudo summary.
We show that our proposed model achieves state-of-the-art performance on all six summarization tasks measured by ROUGE and BERTScore.
arXiv Detail & Related papers (2021-09-09T08:35:39Z)
- Automated News Summarization Using Transformers [4.932130498861987]
We present a comprehensive comparison of several transformer-architecture-based pre-trained models for text summarization.
For analysis and comparison, we use the BBC news dataset, which contains text data suitable for summarization along with human-generated summaries.
arXiv Detail & Related papers (2021-04-23T04:22:33Z)
- What Have We Achieved on Text Summarization? [32.90169694110989]
We aim to gain more understanding of summarization systems with respect to their strengths and limits on a fine-grained syntactic and semantic level.
We manually quantify 8 major sources of error across 10 representative summarization models.
arXiv Detail & Related papers (2020-10-09T12:39:33Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift in the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)
- Pre-training for Abstractive Document Summarization by Reinstating Source Text [105.77348528847337]
This paper presents three pre-training objectives which allow us to pre-train a Seq2Seq based abstractive summarization model on unlabeled text.
Experiments on two benchmark summarization datasets show that all three objectives can improve performance upon baselines.
arXiv Detail & Related papers (2020-04-04T05:06:26Z)