SgSum: Transforming Multi-document Summarization into Sub-graph
Selection
- URL: http://arxiv.org/abs/2110.12645v1
- Date: Mon, 25 Oct 2021 05:12:10 GMT
- Title: SgSum: Transforming Multi-document Summarization into Sub-graph
Selection
- Authors: Moye Chen, Wei Li, Jiachen Liu, Xinyan Xiao, Hua Wu, Haifeng Wang
- Abstract summary: Most existing extractive multi-document summarization (MDS) methods score each sentence individually and extract salient sentences one by one to compose a summary.
We propose a novel MDS framework (SgSum) to formulate the MDS task as a sub-graph selection problem.
Our model can produce significantly more coherent and informative summaries compared with traditional MDS methods.
- Score: 27.40759123902261
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most existing extractive multi-document summarization (MDS) methods score
each sentence individually and extract salient sentences one by one to compose
a summary. This approach has two main drawbacks: (1) it neglects both the intra-
and cross-document relations between sentences; (2) it neglects the coherence and
conciseness of the whole summary. In this paper, we propose a novel MDS
framework (SgSum) to formulate the MDS task as a sub-graph selection problem,
in which source documents are regarded as a relation graph of sentences (e.g.,
similarity graph or discourse graph) and the candidate summaries are its
sub-graphs. Instead of selecting salient sentences, SgSum selects a salient
sub-graph from the relation graph as the summary. Compared with traditional
methods, our method has two main advantages: (1) the relations between
sentences are captured by modeling both the graph structure of the whole
document set and the candidate sub-graphs; (2) it directly outputs an integrated
summary in the form of a sub-graph, which is more informative and coherent.
Extensive experiments on MultiNews and DUC datasets show that our proposed
method brings substantial improvements over several strong baselines. Human
evaluation results also demonstrate that our model can produce significantly
more coherent and informative summaries compared with traditional MDS methods.
Moreover, the proposed architecture has strong transfer ability from single to
multi-document input, which can reduce the resource bottleneck in MDS tasks.
Our code and results are available at:
https://github.com/PaddlePaddle/Research/tree/master/NLP/EMNLP2021-SgSum.
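The sketch below is not from the paper; it is a rough illustration of the sub-graph selection formulation described in the abstract. The TF-IDF similarity graph, the exhaustive candidate enumeration, and the salience-plus-coherence scoring heuristic are assumptions made here for clarity; SgSum itself learns to score candidate sub-graphs (see the repository linked above for the actual implementation).

```python
# Illustrative sketch only -- NOT the SgSum implementation.
# Documents are flattened into sentences, a similarity graph is built over
# them, and the summary is chosen as the highest-scoring sub-graph rather
# than by ranking sentences one by one.
from itertools import combinations

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def build_relation_graph(sentences):
    """Weighted adjacency matrix over sentences (a simple similarity graph)."""
    tfidf = TfidfVectorizer().fit_transform(sentences)
    adj = cosine_similarity(tfidf)
    np.fill_diagonal(adj, 0.0)  # drop self-loops
    return adj


def score_subgraph(adj, nodes):
    """Heuristic score: salience (edges to the whole graph) plus internal
    coherence (edges inside the candidate sub-graph)."""
    nodes = list(nodes)
    salience = adj[nodes, :].sum()
    coherence = adj[np.ix_(nodes, nodes)].sum() / 2.0
    return salience + coherence


def select_summary(documents, summary_size=3):
    """Return the sentences of the best-scoring sub-graph of the given size.
    Exhaustive enumeration is only viable for toy inputs; it stands in for
    the learned candidate scoring used by real systems."""
    sentences = [s for doc in documents for s in doc]
    adj = build_relation_graph(sentences)
    best = max(combinations(range(len(sentences)), summary_size),
               key=lambda nodes: score_subgraph(adj, nodes))
    return [sentences[i] for i in sorted(best)]


if __name__ == "__main__":
    docs = [
        ["The storm hit the coast on Monday.",
         "Thousands of homes lost power."],
        ["Officials said power was restored by Wednesday.",
         "The storm was the strongest in a decade."],
    ]
    print(select_summary(docs, summary_size=2))
```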
Related papers
- Doc2SoarGraph: Discrete Reasoning over Visually-Rich Table-Text
Documents via Semantic-Oriented Hierarchical Graphs [79.0426838808629]
We address the TAT-DQA task, i.e., answering questions over visually-rich table-text documents.
Specifically, we propose a novel Doc2SoarGraph framework with enhanced discrete reasoning capability.
We conduct extensive experiments on TAT-DQA dataset, and the results show that our proposed framework outperforms the best baseline model by 17.73% and 16.91% in terms of Exact Match (EM) and F1 score respectively on the test set.
arXiv Detail & Related papers (2023-05-03T07:30:32Z) - Scientific Paper Extractive Summarization Enhanced by Citation Graphs [50.19266650000948]
We focus on leveraging citation graphs to improve scientific paper extractive summarization under different settings.
Preliminary results demonstrate that the citation graph is helpful even in a simple unsupervised framework.
Motivated by this, we propose a Graph-based Supervised Summarization model (GSS) to achieve more accurate results on the task when large-scale labeled data are available.
arXiv Detail & Related papers (2022-12-08T11:53:12Z) - Reinforcing Semantic-Symmetry for Document Summarization [15.113768658584979]
Document summarization condenses a long document into a short version with salient information and accurate semantic descriptions.
This paper introduces a new reinforcing semantic-symmetry learning model for document summarization.
A series of experiments has been conducted on two widely used benchmark datasets, CNN/Daily Mail and BigPatent.
arXiv Detail & Related papers (2021-12-14T17:41:37Z) - An analysis of document graph construction methods for AMR summarization [2.055054374525828]
We present a novel dataset consisting of human-annotated alignments between the nodes of paired documents and summaries.
We apply these two forms of evaluation to prior work as well as a new method for node merging and show that our new method has significantly better performance than prior work.
arXiv Detail & Related papers (2021-11-27T22:12:50Z) - HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text
Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attentions for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in Rouge F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z) - Multiplex Graph Neural Network for Extractive Text Summarization [34.185093491514394]
Extractive text summarization aims at extracting the most representative sentences from a given document as its summary.
We propose a novel Multiplex Graph Convolutional Network (Multi-GCN) to jointly model different types of relationships among sentences and words.
Based on Multi-GCN, we propose a Multiplex Graph Summarization (Multi-GraS) model for extractive text summarization.
arXiv Detail & Related papers (2021-08-29T16:11:01Z) - BASS: Boosting Abstractive Summarization with Unified Semantic Graph [49.48925904426591]
BASS is a framework for Boosting Abstractive Summarization based on a unified Semantic graph.
A graph-based encoder-decoder model is proposed to improve both the document representation and summary generation process.
Empirical results show that the proposed architecture brings substantial improvements for both long-document and multi-document summarization tasks.
arXiv Detail & Related papers (2021-05-25T16:20:48Z) - SummPip: Unsupervised Multi-Document Summarization with Sentence Graph
Compression [61.97200991151141]
SummPip is an unsupervised method for multi-document summarization.
We convert the original documents into a sentence graph, taking both linguistic and deep representations into account.
We then apply spectral clustering to obtain multiple clusters of sentences, and finally compress each cluster to generate the final summary.
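(A rough code sketch of this sentence-graph-and-clustering pipeline appears after this list.)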
arXiv Detail & Related papers (2020-07-17T13:01:15Z) - Leveraging Graph to Improve Abstractive Multi-Document Summarization [50.62418656177642]
We develop a neural abstractive multi-document summarization (MDS) model which can leverage well-known graph representations of documents.
Our model utilizes graphs to encode documents in order to capture cross-document relations, which is crucial to summarizing long documents.
Our model can also take advantage of graphs to guide the summary generation process, which is beneficial for generating coherent and concise summaries.
arXiv Detail & Related papers (2020-05-20T13:39:47Z)
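Among the related papers above, SummPip spells out its pipeline explicitly enough to sketch (as noted in that entry): build a sentence graph, cluster it spectrally, and compress each cluster. The toy version below substitutes TF-IDF cosine similarity for the paper's combined linguistic and deep representations and picks each cluster's most central sentence instead of performing real compression; scikit-learn's SpectralClustering and every other concrete choice here are illustrative assumptions, not the authors' method.

```python
# Rough sketch of a SummPip-style unsupervised MDS pipeline:
# sentence graph -> spectral clustering -> one sentence per cluster.
# The real SummPip combines linguistic and deep sentence representations
# and compresses each cluster; TF-IDF similarity and "pick the most
# central sentence" stand in for those steps here.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def summarize(documents, n_clusters=3):
    sentences = [s for doc in documents for s in doc]
    tfidf = TfidfVectorizer().fit_transform(sentences)
    affinity = cosine_similarity(tfidf)  # sentence graph as an affinity matrix

    labels = SpectralClustering(
        n_clusters=n_clusters, affinity="precomputed", random_state=0
    ).fit_predict(affinity)

    summary = []
    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        # "Compression" stand-in: keep the sentence most similar to the
        # rest of its own cluster.
        central = idx[np.argmax(affinity[np.ix_(idx, idx)].sum(axis=1))]
        summary.append(sentences[central])
    return summary
```

One sentence per cluster keeps the output concise; swapping in stronger sentence representations or a real multi-sentence compression step is where the actual method differs from this toy.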
This list is automatically generated from the titles and abstracts of the papers on this site.