Abstractive Summarization Guided by Latent Hierarchical Document
Structure
- URL: http://arxiv.org/abs/2211.09458v1
- Date: Thu, 17 Nov 2022 11:02:30 GMT
- Title: Abstractive Summarization Guided by Latent Hierarchical Document
Structure
- Authors: Yifu Qiu, Shay B. Cohen
- Abstract summary: Sequential abstractive neural summarizers often do not use the underlying structure in the input article or dependencies between the input sentences.
We propose a hierarchy-aware graph neural network (HierGNN) which captures such dependencies through three main steps.
Experiments confirm HierGNN improves strong sequence models such as BART, by margins of 0.55 and 0.75 in average ROUGE-1/2/L on CNN/DM and XSum, respectively.
- Score: 28.284926421845533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sequential abstractive neural summarizers often do not use the underlying
structure in the input article or dependencies between the input sentences.
This structure is essential to integrate and consolidate information from
different parts of the text. To address this shortcoming, we propose a
hierarchy-aware graph neural network (HierGNN) which captures such dependencies
through three main steps: 1) learning a hierarchical document structure through
a latent structure tree learned by a sparse matrix-tree computation; 2)
propagating sentence information over this structure using a novel
message-passing node propagation mechanism to identify salient information; 3)
using graph-level attention to concentrate the decoder on salient information.
Experiments confirm HierGNN improves strong sequence models such as BART, by
margins of 0.55 and 0.75 in average ROUGE-1/2/L on CNN/DM and XSum,
respectively. Further human evaluation demonstrates that summaries produced by
our model are more relevant and less redundant than those of the baselines
into which HierGNN is incorporated. We also find that HierGNN synthesizes
summaries more often by fusing multiple source sentences than by compressing a
single one, and that it processes long inputs more effectively.
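To make step (1) concrete, here is a minimal sketch of the standard matrix-tree computation for latent (non-projective) dependency structures: the log-partition is a log-determinant, and its gradient gives edge marginals. The paper uses a sparse variant of this computation; all names, shapes, and the toy example below are illustrative assumptions, not the authors' code.

```python
import torch

def matrix_tree_marginals(scores: torch.Tensor, root_scores: torch.Tensor):
    """Edge marginals of non-projective dependency trees over n sentences,
    via the matrix-tree theorem: log Z is the log-determinant of a
    root-augmented Laplacian, and its gradient w.r.t. the log-potentials
    gives the marginal probability of each edge.

    scores:      (n, n) log-potential for edge parent i -> child j
    root_scores: (n,)   log-potential for sentence j being the root
    """
    n = scores.size(0)
    scores = scores.detach().clone().requires_grad_(True)
    root_scores = root_scores.detach().clone().requires_grad_(True)
    A = scores.exp() * (1.0 - torch.eye(n))              # zero self-loops
    L = torch.diag(A.sum(dim=0)) - A                     # graph Laplacian
    # Standard "root trick": replace the first row with root potentials.
    L_hat = torch.cat([root_scores.exp().unsqueeze(0), L[1:]], dim=0)
    log_Z = torch.logdet(L_hat)                          # log partition
    edge_marg, root_marg = torch.autograd.grad(log_Z, (scores, root_scores))
    return edge_marg, root_marg

# Toy check: 3 sentences with uniform potentials give symmetric marginals.
edge_m, root_m = matrix_tree_marginals(torch.zeros(3, 3), torch.zeros(3))
print(root_m.sum())  # tensor(1.) -- exactly one root per latent tree
```

The marginal matrix can then serve as a soft adjacency for the message passing in step (2).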
Related papers
- Learning to Model Graph Structural Information on MLPs via Graph Structure Self-Contrasting [50.181824673039436]
We propose a Graph Structure Self-Contrasting (GSSC) framework that learns graph structural information without message passing.
The proposed framework is based purely on Multi-Layer Perceptrons (MLPs), where the structural information is only implicitly incorporated as prior knowledge.
It first applies structural sparsification to remove potentially uninformative or noisy edges in the neighborhood, and then performs structural self-contrasting in the sparsified neighborhood to learn robust node representations.
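A hedged PyTorch sketch of those two steps; the cosine-similarity scoring rule and the InfoNCE-style objective are assumptions for illustration, not GSSC's exact formulation.

```python
import torch
import torch.nn.functional as F

def sparsify_edges(h, edge_index, keep_ratio=0.5):
    """Structural sparsification: drop the least similar endpoint pairs.
    Cosine similarity as the scoring rule is an assumption."""
    src, dst = edge_index
    sim = F.cosine_similarity(h[src], h[dst])
    k = max(1, int(keep_ratio * sim.numel()))
    return edge_index[:, sim.topk(k).indices]

def structural_contrast_loss(h, edge_index, tau=0.5):
    """Structural self-contrasting (InfoNCE-style): each node's MLP
    embedding should match the mean of its retained neighbourhood more
    than any other node's, injecting structure without message passing."""
    src, dst = edge_index
    agg = torch.zeros_like(h).index_add_(0, dst, h[src])
    deg = torch.bincount(dst, minlength=h.size(0)).clamp(min=1)
    pos = agg / deg.unsqueeze(-1).to(h.dtype)
    logits = F.normalize(h, dim=-1) @ F.normalize(pos, dim=-1).t() / tau
    return F.cross_entropy(logits, torch.arange(h.size(0)))

h = torch.randn(6, 8)                                  # MLP node embeddings
ei = torch.tensor([[0, 1, 2, 3, 4, 5], [1, 2, 3, 4, 5, 0]])
print(structural_contrast_loss(h, sparsify_edges(h, ei)))
```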
arXiv Detail & Related papers (2024-09-09T12:56:02Z)
- Hierarchical Attention Graph for Scientific Document Summarization in Global and Local Level [3.7651378994837104]
Long inputs hinder the simultaneous modeling of global high-order relations between sentences and local intra-sentence relations.
We propose HAESum, a novel approach utilizing graph neural networks to model documents based on their hierarchical discourse structure.
We validate our approach on two benchmark datasets, and the experimental results demonstrate the effectiveness of HAESum.
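A bare-bones sketch of the two-level idea, with mean pooling standing in for HAESum's learned attention at each level; function names and shapes are assumed.

```python
import torch

def hierarchical_encode(word_embs_per_sent, sent_adj):
    """Local level: pool each sentence's word nodes into a sentence node
    (mean pooling stands in for the paper's learned word-level graph).
    Global level: one propagation step over a sentence-sentence graph."""
    sents = torch.stack([w.mean(dim=0) for w in word_embs_per_sent])
    return sent_adj @ sents                  # inter-sentence propagation

doc = [torch.randn(5, 16), torch.randn(8, 16), torch.randn(3, 16)]
adj = torch.full((3, 3), 1.0 / 3)            # toy normalized sentence graph
print(hierarchical_encode(doc, adj).shape)   # torch.Size([3, 16])
```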
arXiv Detail & Related papers (2024-05-16T15:46:30Z)
- Prompt Based Tri-Channel Graph Convolution Neural Network for Aspect Sentiment Triplet Extraction [63.0205418944714]
Aspect Sentiment Triplet Extraction (ASTE) is an emerging task of extracting triplets of aspects, opinions, and sentiments from a given sentence.
Recent studies tend to address this task with a table-filling paradigm, wherein word relations are encoded in a two-dimensional table.
We propose a novel model for the ASTE task, called Prompt-based Tri-Channel Graph Convolution Neural Network (PT-GCN), which converts the relation table into a graph to explore more comprehensive relational information.
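A schematic sketch of the table-to-graph conversion described above; the cell-to-edge reading and the toy labels are assumptions, not PT-GCN's exact construction.

```python
def table_to_graph(relation_table):
    """Read a 2-D word-relation table (as produced by table filling) as a
    graph: every non-empty cell (i, j) with label r becomes a typed edge
    i -> j, which a graph convolution can then propagate over."""
    edges = []
    for i, row in enumerate(relation_table):
        for j, label in enumerate(row):
            if label is not None:
                edges.append((i, j, label))
    return edges

# Toy sentence: "battery life is great"; cells mark aspect-opinion pairs.
table = [[None] * 4 for _ in range(4)]
table[0][3] = "positive"                    # battery <-> great
table[1][3] = "positive"                    # life    <-> great
print(table_to_graph(table))  # [(0, 3, 'positive'), (1, 3, 'positive')]
```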
arXiv Detail & Related papers (2023-12-18T12:46:09Z)
- Incorporating Constituent Syntax for Coreference Resolution [50.71868417008133]
We propose a graph-based method to incorporate constituent syntactic structures.
We also explore utilising higher-order neighbourhood information to encode rich structures in constituent trees.
Experiments on the English and Chinese portions of OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
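One plausible reading of this construction, sketched below: parent-child edges from a constituency tree, plus 2-hop (grandparent) edges as the higher-order neighbourhood. The nested-tuple tree encoding is assumed purely for illustration.

```python
def tree_edges(tree):
    """Parent-child edges from a nested-tuple constituency tree,
    e.g. ("S", ("NP", "John"), ("VP", "smiled"))."""
    nodes, edges = [], []
    def visit(t, parent):
        idx = len(nodes)
        nodes.append(t[0] if isinstance(t, tuple) else t)
        if parent is not None:
            edges.append((parent, idx))
        if isinstance(t, tuple):
            for child in t[1:]:
                visit(child, idx)
    visit(tree, None)
    return nodes, edges

nodes, edges = tree_edges(("S", ("NP", "John"), ("VP", "smiled")))
# Higher-order neighbourhood: additionally connect 2-hop pairs
# (grandparent-grandchild), one reading of the abstract's idea.
second_order = {(a, c) for a, b in edges for b2, c in edges if b == b2}
print(nodes, sorted(second_order))
```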
arXiv Detail & Related papers (2022-02-22T07:40:42Z)
- TextRGNN: Residual Graph Neural Networks for Text Classification [13.912147013558846]
TextRGNN is an improved GNN structure that introduces residual connections to deepen the convolutional network.
Our structure can obtain a wider node receptive field and effectively suppress the over-smoothing of node features.
It can significantly improve classification accuracy at both the corpus level and the text level, achieving SOTA performance on a wide range of text classification datasets.
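A minimal sketch of the core mechanism, a graph-convolution layer with a residual connection; the layer details are assumed and may differ from TextRGNN's.

```python
import torch
import torch.nn as nn

class ResidualGCNLayer(nn.Module):
    """One graph convolution with a residual (skip) connection: deep
    stacks keep node features from collapsing into one another
    (over-smoothing), which is the property the paper exploits."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, h, adj_norm):
        # adj_norm: normalized adjacency, e.g. D^-1/2 (A + I) D^-1/2
        return h + torch.relu(adj_norm @ self.linear(h))

# A deep stack stays well-behaved thanks to the skip connections.
n, dim = 5, 16
h, adj = torch.randn(n, dim), torch.eye(n)
for layer in [ResidualGCNLayer(dim) for _ in range(8)]:
    h = layer(h, adj)
print(h.shape)  # torch.Size([5, 16])
```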
arXiv Detail & Related papers (2021-12-30T13:48:58Z)
- Sparse Structure Learning via Graph Neural Networks for Inductive Document Classification [2.064612766965483]
We propose a novel GNN-based sparse structure learning model for inductive document classification.
Our model collects a set of trainable edges connecting disjoint words between sentences and employs structure learning to sparsely select edges with dynamic contextual dependencies.
Experiments on several real-world datasets demonstrate that the proposed model outperforms most state-of-the-art results.
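A hedged sketch of sparse edge selection with trainable scores; the bilinear scorer and the top-k rule are assumptions standing in for the paper's structure-learning module.

```python
import torch
import torch.nn as nn

class SparseEdgeSelector(nn.Module):
    """Score candidate word-word edges with a trainable bilinear form and
    keep only the top-k per node. The hard top-k mask is not itself
    differentiable, but the sigmoid edge weights carry gradients, so the
    selected structure can adapt to context during training."""
    def __init__(self, dim, k=4):
        super().__init__()
        self.bilinear = nn.Parameter(torch.eye(dim))
        self.k = k

    def forward(self, h):                        # h: (n, dim) word features
        scores = h @ self.bilinear @ h.t()       # (n, n) candidate edges
        topk = scores.topk(self.k, dim=-1).indices
        mask = torch.zeros_like(scores).scatter_(-1, topk, 1.0)
        return mask * scores.sigmoid()           # sparse weighted adjacency

h = torch.randn(10, 32)
print(SparseEdgeSelector(32)(h).count_nonzero())  # at most 10 * k edges
```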
arXiv Detail & Related papers (2021-12-13T02:36:04Z)
- Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation [56.73834525802723]
Lightweight Dynamic Graph Convolutional Networks (LDGCNs) are proposed.
LDGCNs capture richer non-local interactions by synthesizing higher-order information from the input graphs.
We develop two novel parameter saving strategies based on the group graph convolutions and weight tied convolutions to reduce memory usage and model complexity.
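The weight-tying strategy admits a very small sketch: reuse one weight matrix across propagation steps, so depth adds receptive field but no parameters. Details here are illustrative, not LDGCN's code.

```python
import torch
import torch.nn as nn

class WeightTiedGCN(nn.Module):
    """K graph-convolution steps that reuse one weight matrix: the
    receptive field grows with depth (higher-order information) while the
    parameter count stays constant. Group convolutions, the paper's other
    saving strategy, would additionally split channels into groups."""
    def __init__(self, dim, steps=3):
        super().__init__()
        self.shared = nn.Linear(dim, dim)   # single tied weight matrix
        self.steps = steps

    def forward(self, h, adj_norm):
        for _ in range(self.steps):         # deeper, but no new parameters
            h = torch.relu(adj_norm @ self.shared(h))
        return h

model = WeightTiedGCN(16, steps=4)
print(sum(p.numel() for p in model.parameters()))  # 16*16 + 16 = 272
```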
arXiv Detail & Related papers (2020-10-09T06:03:46Z)
- Heterogeneous Graph Neural Networks for Extractive Document Summarization [101.17980994606836]
Modeling cross-sentence relations is a crucial step in extractive document summarization.
We present a graph-based neural network for extractive summarization (HeterSumGraph).
We introduce different types of nodes into graph-based neural networks for extractive document summarization.
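A minimal sketch of the heterogeneous-node idea: a bipartite word-sentence graph in which shared words relay cross-sentence relations. Edge weighting (the paper uses TF-IDF-based edge features) is omitted for brevity.

```python
def build_word_sentence_graph(sentences):
    """Bipartite graph with two node types: each sentence node links to
    the word nodes it contains, so shared words act as relays that carry
    information between sentences during message passing."""
    words = sorted({w for s in sentences for w in s.split()})
    word_id = {w: i for i, w in enumerate(words)}
    edges = [(word_id[w], j)                 # (word node, sentence node)
             for j, s in enumerate(sentences) for w in set(s.split())]
    return words, edges

words, edges = build_word_sentence_graph(
    ["the cat sat", "the dog ran", "cat and dog met"])
print(len(words), len(edges))  # shared words like "cat" bridge sentences
```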
arXiv Detail & Related papers (2020-04-26T14:38:11Z)
- Improved Code Summarization via a Graph Neural Network [96.03715569092523]
In general, source code summarization techniques take source code as input and output a natural language description.
We present an approach that uses a graph-based neural architecture that better matches the default structure of the AST to generate these summaries.
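A generic sketch of AST-to-graph construction using Python's own ast module; the paper's exact graph construction may differ.

```python
import ast

def ast_to_graph(source: str):
    """One node per AST node (labelled by its type), one edge per
    parent-child link -- the generic way to hand an AST to a graph
    neural encoder."""
    tree = ast.parse(source)
    nodes, edges, index = [], [], {}
    def nid(node):
        if node not in index:
            index[node] = len(nodes)
            nodes.append(type(node).__name__)
        return index[node]
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            edges.append((nid(parent), nid(child)))
    return nodes, edges

nodes, edges = ast_to_graph("def add(a, b):\n    return a + b")
print(nodes[:4])  # ['Module', 'FunctionDef', 'arguments', 'Return']
```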
arXiv Detail & Related papers (2020-04-06T17:36:42Z)
- Selective Attention Encoders by Syntactic Graph Convolutional Networks for Document Summarization [21.351111598564987]
We propose a graph that connects the parse trees of the sentences in a document and utilize stacked graph convolutional networks (GCNs) to learn a syntactic representation for the document.
The proposed GCN-based selective attention approach outperforms the baselines and achieves state-of-the-art performance on the dataset.
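A sketch of one possible way to connect per-sentence parse trees into a single document graph; the root-to-root linking rule is an assumption, as the abstract does not specify how the trees are joined.

```python
def connect_sentence_trees(trees):
    """Merge per-sentence parse trees into one document graph. Each tree
    is (edge_list, num_nodes) with node 0 as the sentence root; linking
    each root to the next sentence's root is an assumed cross-sentence
    rule, purely for illustration."""
    doc_edges, offset, prev_root = [], 0, None
    for edges, n in trees:
        doc_edges += [(a + offset, b + offset) for a, b in edges]
        if prev_root is not None:
            doc_edges.append((prev_root, offset))   # root-to-root link
        prev_root = offset
        offset += n
    return doc_edges

# Two tiny 3-node trees; node 0 is the root of each sentence.
print(connect_sentence_trees([([(0, 1), (0, 2)], 3),
                              ([(0, 1), (1, 2)], 3)]))
# [(0, 1), (0, 2), (3, 4), (4, 5), (0, 3)]
```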
arXiv Detail & Related papers (2020-03-18T01:30:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.