LLM Based Multi-Document Summarization Exploiting Main-Event Biased
Monotone Submodular Content Extraction
- URL: http://arxiv.org/abs/2310.03414v1
- Date: Thu, 5 Oct 2023 09:38:09 GMT
- Title: LLM Based Multi-Document Summarization Exploiting Main-Event Biased
Monotone Submodular Content Extraction
- Authors: Litton J Kurisinkel, Nancy F. Chen
- Abstract summary: Multi-document summarization is a challenging task due to its inherent subjective bias.
We aim to enhance the objectivity of news summarization by focusing on the main event of a group of related news documents.
- Score: 42.171703872560286
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-document summarization is a challenging task due to its inherent
subjective bias, highlighted by the low inter-annotator ROUGE-1 score of 0.4
among DUC-2004 reference summaries. In this work, we aim to enhance the
objectivity of news summarization by focusing on the main event of a group of
related news documents and presenting it coherently with sufficient context.
Our primary objective is to succinctly report the main event, ensuring that the
summary remains objective and informative. To achieve this, we employ an
extract-rewrite approach that incorporates a main-event biased
monotone-submodular function for content selection. This enables us to extract
the most crucial information related to the main event from the document
cluster. To ensure coherence, we utilize a fine-tuned Large Language Model (LLM) for
rewriting the extracted content into a coherent text. The evaluation using
objective metrics and human evaluators confirms the effectiveness of our
approach, as it surpasses potential baselines in content coverage,
coherence, and informativeness.
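The content-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual objective, so the code below assumes a stand-in: weighted word coverage, where words from a hypothetical set of main-event terms count `bias` times as much as other words. Weighted coverage with non-negative weights is monotone and submodular, and the standard greedy algorithm picks `k` sentences under a cardinality constraint. The names `tokenize`, `coverage`, `greedy_select`, `event_terms`, and `bias`, as well as the example sentences, are illustrative assumptions and not the authors' implementation.

```python
import string


def tokenize(sentence):
    """Lowercase a sentence and return its set of punctuation-stripped words."""
    words = (w.strip(string.punctuation).lower() for w in sentence.split())
    return {w for w in words if w}


def coverage(selected, sentences, event_terms, bias):
    """Weighted word coverage (assumed stand-in objective).

    Each distinct covered word counts once; words in event_terms
    count `bias` instead of 1.0. Monotone submodular for bias >= 0.
    """
    covered = set()
    for i in selected:
        covered |= tokenize(sentences[i])
    return sum(bias if w in event_terms else 1.0 for w in covered)


def greedy_select(sentences, event_terms, k, bias=5.0):
    """Greedily pick up to k sentence indices by largest marginal gain."""
    selected = []
    while len(selected) < k:
        base = coverage(selected, sentences, event_terms, bias)
        best, best_gain = None, 0.0
        for i in range(len(sentences)):
            if i in selected:
                continue
            gain = coverage(selected + [i], sentences, event_terms, bias) - base
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # no remaining sentence adds anything new
            break
        selected.append(best)
    return selected


# Hypothetical toy cluster and main-event terms, for illustration only.
sentences = [
    "An earthquake struck the coastal city on Monday.",
    "The earthquake damaged hundreds of homes in the city.",
    "Local football results were announced on Monday.",
    "Rescue teams reached the earthquake zone.",
]
event_terms = {"earthquake", "city", "rescue", "damaged"}
picked = greedy_select(sentences, event_terms, k=2)
```

On this toy input the greedy pass picks sentence 1 and then sentence 3, skipping the off-topic football sentence because the event bias outweighs its larger count of new words. In an extract-rewrite pipeline of the kind the abstract describes, the selected sentences would then be passed to a language model for rewriting; that step is not sketched here.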
Related papers
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created human-annotated dataset consisting of coherent summaries for five publicly available datasets and natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% ROUGE-L) in terms of producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z)
- GUMsley: Evaluating Entity Salience in Summarization for 12 English Genres [14.37990666928991]
We present and evaluate GUMsley, the first entity salience dataset covering all named and non-named salient entities for 12 genres of English text.
We show that predicting or providing salient entities to several model architectures enhances performance and helps derive higher-quality summaries.
arXiv Detail & Related papers (2024-01-31T16:30:50Z)
- Summary-Oriented Vision Modeling for Multimodal Abstractive Summarization [63.320005222549646]
Multimodal abstractive summarization (MAS) aims to produce a concise summary given the multimodal data (text and vision).
We propose to improve the summary quality through summary-oriented visual features.
Experiments on 44 languages, covering mid-high, low-, and zero-resource scenarios, verify the effectiveness and superiority of the proposed approach.
arXiv Detail & Related papers (2022-12-15T09:05:26Z)
- Evaluating and Improving Factuality in Multimodal Abstractive Summarization [91.46015013816083]
We propose CLIPBERTScore to leverage the robustness and strong factuality detection performance between image-summary and document-summary.
We show that this simple combination of two metrics in the zero-shot setting achieves higher correlations than existing factuality metrics for document summarization.
Our analysis demonstrates the robustness and high correlation of CLIPBERTScore and its components on four factuality metric-evaluation benchmarks.
arXiv Detail & Related papers (2022-11-04T16:50:40Z)
- Controlled Text Reduction [15.102190738450092]
We formalize Controlled Text Reduction as a standalone task.
A model then needs to generate a coherent text that includes all and only the target information.
arXiv Detail & Related papers (2022-10-24T17:59:03Z)
- Salience Allocation as Guidance for Abstractive Summarization [61.31826412150143]
We propose a novel summarization approach with flexible and reliable salience guidance, namely SEASON (SaliencE Allocation as Guidance for Abstractive SummarizatiON).
SEASON utilizes the allocation of salience expectation to guide abstractive summarization and adapts well to articles with different levels of abstractiveness.
arXiv Detail & Related papers (2022-10-22T02:13:44Z)
- AgreeSum: Agreement-Oriented Multi-Document Summarization [3.4743618614284113]
Given a cluster of articles, the goal is to provide abstractive summaries that represent information common and faithful to all input articles.
We create a dataset for AgreeSum, and provide annotations on article-summary entailment relations for a subset of the clusters in the dataset.
arXiv Detail & Related papers (2021-06-04T06:17:49Z)
- Understanding the Extent to which Summarization Evaluation Metrics Measure the Information Quality of Summaries [74.28810048824519]
We analyze the token alignments used by ROUGE and BERTScore to compare summaries.
We argue that their scores largely cannot be interpreted as measuring information overlap.
arXiv Detail & Related papers (2020-10-23T15:55:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.