Supervising the Centroid Baseline for Extractive Multi-Document
Summarization
- URL: http://arxiv.org/abs/2311.17771v1
- Date: Wed, 29 Nov 2023 16:11:45 GMT
- Title: Supervising the Centroid Baseline for Extractive Multi-Document
Summarization
- Authors: Sim\~ao Gon\c{c}alves, Gon\c{c}alo Correia, Diogo Pernes, Afonso
Mendes
- Abstract summary: The centroid method is a simple approach for extractive multi-document summarization.
We refine it by adding a beam search process to the sentence selection and also a centroid estimation attention model that leads to improved results.
- Score: 2.0306707203430348
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The centroid method is a simple approach for extractive multi-document
summarization and many improvements to its pipeline have been proposed. We
further refine it by adding a beam search process to the sentence selection and
also a centroid estimation attention model that leads to improved results. We
demonstrate this in several multi-document summarization datasets, including in
a multilingual scenario.
Related papers
- Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation [60.493180081319785]
We propose a systematic way to estimate the intrinsic capacity of a truncation sampling method by considering the trade-off between diversity and risk at each decoding step.
Our work provides a comprehensive comparison between existing truncation sampling methods, as well as their recommended parameters as a guideline for users.
arXiv Detail & Related papers (2024-08-24T14:14:32Z) - Unsupervised Multi-document Summarization with Holistic Inference [41.58777650517525]
This paper proposes a new holistic framework for unsupervised multi-document extractive summarization.
Subset Representative Index (SRI) balances the importance and diversity of a subset of sentences from the source documents.
Our findings suggest that diversity is essential for improving multi-document summary performance.
arXiv Detail & Related papers (2023-09-08T02:56:30Z) - UniSumm and SummZoo: Unified Model and Diverse Benchmark for Few-Shot
Summarization [54.59104881168188]
textscUniSumm is a unified few-shot summarization model pre-trained with multiple summarization tasks.
textscSummZoo is a new benchmark to better evaluate few-shot summarizers.
arXiv Detail & Related papers (2022-11-17T18:54:47Z) - ACM -- Attribute Conditioning for Abstractive Multi Document
Summarization [0.0]
We propose a model that incorporates attribute conditioning modules in order to decouple conflicting information by conditioning for a certain attribute in the output summary.
This approach shows strong gains in ROUGE score over baseline multi document summarization approaches.
arXiv Detail & Related papers (2022-05-09T00:00:14Z) - Large-Scale Multi-Document Summarization with Information Extraction and
Compression [31.601707033466766]
We develop an abstractive summarization framework independent of labeled data for multiple heterogeneous documents.
Our framework processes documents telling different stories instead of documents on the same topic.
Our experiments demonstrate that our framework outperforms current state-of-the-art methods in this more generic setting.
arXiv Detail & Related papers (2022-05-01T19:49:15Z) - Unsupervised Summarization with Customized Granularities [76.26899748972423]
We propose the first unsupervised multi-granularity summarization framework, GranuSum.
By inputting different numbers of events, GranuSum is capable of producing multi-granular summaries in an unsupervised manner.
arXiv Detail & Related papers (2022-01-29T05:56:35Z) - An Enhanced MeanSum Method For Generating Hotel Multi-Review
Summarizations [0.06091702876917279]
This work uses Multi-Aspect Masker(MAM) as content selector to address the issue with multi-aspect.
We also propose a regularizer to control the length of the generated summaries.
Our improved model achieves higher ROUGE, Sentiment Accuracy than the original Meansum method.
arXiv Detail & Related papers (2020-12-07T13:16:01Z) - SupMMD: A Sentence Importance Model for Extractive Summarization using
Maximum Mean Discrepancy [92.5683788430012]
SupMMD is a novel technique for generic and update summarization based on the maximum discrepancy from kernel two-sample testing.
We show the efficacy of SupMMD in both generic and update summarization tasks by meeting or exceeding the current state-of-the-art on the DUC-2004 and TAC-2009 datasets.
arXiv Detail & Related papers (2020-10-06T09:26:55Z) - SummPip: Unsupervised Multi-Document Summarization with Sentence Graph
Compression [61.97200991151141]
SummPip is an unsupervised method for multi-document summarization.
We convert the original documents to a sentence graph, taking both linguistic and deep representation into account.
We then apply spectral clustering to obtain multiple clusters of sentences, and finally compress each cluster to generate the final summary.
arXiv Detail & Related papers (2020-07-17T13:01:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.