Neural Abstractive Summarization with Structural Attention
- URL: http://arxiv.org/abs/2004.09739v2
- Date: Sat, 10 Oct 2020 05:32:59 GMT
- Title: Neural Abstractive Summarization with Structural Attention
- Authors: Tanya Chowdhury, Sachin Kumar, Tanmoy Chakraborty
- Abstract summary: We present a hierarchical encoder based on structural attention to model inter-sentence and inter-document dependencies.
We show that our proposed model achieves significant improvement over the baselines in both single and multi-document summarization settings.
- Score: 31.50918718905953
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Attentional, RNN-based encoder-decoder architectures have achieved impressive
performance on abstractive summarization of news articles. However, these
methods fail to account for long-term dependencies within the sentences of a
document. This problem is exacerbated in multi-document summarization tasks
such as summarizing the popular opinion in threads present in community
question answering (CQA) websites such as Yahoo! Answers and Quora. These
threads contain answers which often overlap or contradict each other. In this
work, we present a hierarchical encoder based on structural attention to model
such inter-sentence and inter-document dependencies. We set the popular
pointer-generator architecture and some of the architectures derived from it as
our baselines and show that they fail to generate good summaries in a
multi-document setting. We further illustrate that our proposed model achieves
significant improvement over the baselines in both single and multi-document
summarization settings -- in the former setting, it beats the best baseline by
1.31 and 7.8 ROUGE-1 points on CNN and CQA datasets, respectively; in the
latter setting, the performance is further improved by 1.6 ROUGE-1 points on
the CQA dataset.
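To make the idea concrete, below is a minimal PyTorch sketch of a two-level hierarchical encoder: a word-level RNN builds sentence vectors, a sentence-level RNN runs over them, and an inter-sentence attention layer lets each sentence condition on the others. This is a hedged illustration, not the authors' model; the paper induces latent dependency structure with structural attention, whereas this sketch substitutes plain scaled dot-product self-attention, and every class name and dimension here is an assumption made for illustration.

```python
# Minimal sketch of a two-level (hierarchical) encoder. NOT the paper's
# exact model: structural attention over latent dependencies is replaced
# by plain dot-product self-attention over sentence vectors.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalEncoder(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int = 128, hid_dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Word-level encoder: one GRU pass per sentence.
        self.word_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        # Sentence-level encoder over the sequence of sentence vectors.
        self.sent_rnn = nn.GRU(2 * hid_dim, hid_dim, batch_first=True, bidirectional=True)
        # Projections for the inter-sentence attention.
        self.q_proj = nn.Linear(2 * hid_dim, hid_dim)
        self.k_proj = nn.Linear(2 * hid_dim, hid_dim)

    def forward(self, docs: torch.Tensor) -> torch.Tensor:
        # docs: (batch, n_sents, n_words) of token ids, 0 = padding.
        b, s, w = docs.shape
        words = self.embed(docs.view(b * s, w))             # (b*s, w, emb)
        word_states, _ = self.word_rnn(words)               # (b*s, w, 2*hid)
        sent_vecs = word_states.mean(dim=1).view(b, s, -1)  # mean-pool words
        sent_states, _ = self.sent_rnn(sent_vecs)           # (b, s, 2*hid)
        # Inter-sentence attention: each sentence attends over all others.
        q, k = self.q_proj(sent_states), self.k_proj(sent_states)
        scores = q @ k.transpose(1, 2) / (q.size(-1) ** 0.5)  # (b, s, s)
        attn = F.softmax(scores, dim=-1)
        # Structure-aware sentence representations fed to the decoder.
        return sent_states + attn @ sent_states

# Shape check with toy data: 2 documents, 5 sentences, 12 words each.
enc = HierarchicalEncoder(vocab_size=1000)
out = enc(torch.randint(0, 1000, (2, 5, 12)))
print(out.shape)  # torch.Size([2, 5, 512])
```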
Related papers
- REXEL: An End-to-end Model for Document-Level Relation Extraction and Entity Linking [11.374031643273941]
REXEL is a highly efficient and accurate model for the joint task of document-level closed information extraction (DocIE).
It is on average 11 times faster than competitive existing approaches in a similar setting.
The combination of speed and accuracy makes REXEL an accurate cost-efficient system for extracting structured information at web-scale.
arXiv Detail & Related papers (2024-04-19T11:04:27Z) - Enriching Relation Extraction with OpenIE [70.52564277675056]
Relation extraction (RE) is a sub-discipline of information extraction (IE).
In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE.
Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models.
arXiv Detail & Related papers (2022-12-19T11:26:23Z) - ReSel: N-ary Relation Extraction from Scientific Text and Tables by
Learning to Retrieve and Select [53.071352033539526]
We study the problem of extracting N-ary relations from scientific articles.
Our proposed method ReSel decomposes this task into a two-stage procedure.
Our experiments on three scientific information extraction datasets show that ReSel outperforms state-of-the-art baselines significantly.
arXiv Detail & Related papers (2022-10-26T02:28:02Z) - HIBRIDS: Attention with Hierarchical Biases for Structure-aware Long
Document Summarization [17.58231642569116]
We present HIBRIDS, which injects Hierarchical Biases foR incorporating Document Structure into the calculation of attention scores.
We also present a new task, hierarchical question-summary generation, for summarizing salient content in the source document into a hierarchy of questions and summaries.
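A hedged sketch of the general mechanism this summary describes: a learnable bias, indexed by the hierarchical (tree) distance between two tokens' positions in the document structure, is added to the attention logits before the softmax. This is not HIBRIDS' exact formulation (the paper also conditions on level difference, among other details); the module and argument names are illustrative.

```python
# Sketch: attention logits biased by document-structure tree distance.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructureBiasedAttention(nn.Module):
    def __init__(self, dim: int, max_tree_distance: int = 16):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # One learnable scalar bias per possible tree distance.
        self.dist_bias = nn.Embedding(max_tree_distance, 1)

    def forward(self, x: torch.Tensor, tree_dist: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, dim); tree_dist: (batch, n, n) integer path lengths
        # between the document-structure nodes of tokens i and j.
        q, k, v = self.q(x), self.k(x), self.v(x)
        logits = q @ k.transpose(1, 2) / (x.size(-1) ** 0.5)      # (b, n, n)
        logits = logits + self.dist_bias(tree_dist).squeeze(-1)   # add structural bias
        return F.softmax(logits, dim=-1) @ v

x = torch.randn(2, 6, 32)
dist = torch.randint(0, 16, (2, 6, 6))
print(StructureBiasedAttention(32)(x, dist).shape)  # torch.Size([2, 6, 32])
```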
arXiv Detail & Related papers (2022-03-21T05:27:35Z) - Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document in which the top level captures long-range dependencies.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z) - Hierarchical Neural Network Approaches for Long Document Classification [3.6700088931938835]
We employ pre-trained Universal Sentence Encoder (USE) and Bidirectional Encoder Representations from Transformers (BERT) models in a hierarchical setup to capture better representations efficiently.
Our proposed models are conceptually simple: we divide the input data into chunks and pass them through the base BERT or USE model.
We show that USE + CNN/LSTM performs better than its stand-alone baseline, whereas BERT + CNN/LSTM performs on par with its stand-alone counterpart.
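A minimal sketch of the chunk-then-aggregate recipe described above, assuming chunk embeddings have already been produced by a frozen base encoder (a BERT [CLS] vector or a USE sentence vector); the LSTM aggregator shown is one of the paper's two variants, and all names and dimensions are illustrative.

```python
# Sketch of the chunk-then-aggregate classifier. The frozen base encoder
# (BERT/USE) is assumed to run upstream and produce one vector per chunk.
import torch
import torch.nn as nn

class ChunkedDocClassifier(nn.Module):
    def __init__(self, chunk_emb_dim: int = 512, hid_dim: int = 128, n_classes: int = 4):
        super().__init__()
        # Aggregator over the sequence of chunk embeddings (LSTM variant;
        # the paper also reports a CNN aggregator).
        self.lstm = nn.LSTM(chunk_emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hid_dim, n_classes)

    def forward(self, chunk_embs: torch.Tensor) -> torch.Tensor:
        # chunk_embs: (batch, n_chunks, emb) -- one vector per chunk from
        # the frozen base model.
        states, _ = self.lstm(chunk_embs)
        return self.out(states.mean(dim=1))  # pool chunks, then classify

# Toy usage: 8 documents, each split into 10 chunks of 512-dim embeddings.
clf = ChunkedDocClassifier()
print(clf(torch.randn(8, 10, 512)).shape)  # torch.Size([8, 4])
```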
arXiv Detail & Related papers (2022-01-18T07:17:40Z) - Nested and Balanced Entity Recognition using Multi-Task Learning [0.0]
This paper introduces a partly-layered network architecture that deals with the complexity of overlapping and nested cases.
We train and evaluate this architecture to recognise two kinds of entities: Concepts (CR) and Named Entities (NER).
Our approach achieves state-of-the-art NER performance and outperforms previous CR approaches.
arXiv Detail & Related papers (2021-06-11T07:52:32Z) - BASS: Boosting Abstractive Summarization with Unified Semantic Graph [49.48925904426591]
BASS is a framework for Boosting Abstractive Summarization based on a unified Semantic graph.
A graph-based encoder-decoder model is proposed to improve both the document representation and summary generation process.
Empirical results show that the proposed architecture brings substantial improvements for both long-document and multi-document summarization tasks.
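Below is a hedged sketch of the general graph-encoder idea (not BASS's actual architecture): node representations from a semantic graph are refined by attention restricted to graph neighbors, and such graph-aware states could then condition a summary decoder. The adjacency input and module names are assumptions for illustration.

```python
# Sketch: one graph-attention layer masked by semantic-graph adjacency.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, nodes: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # nodes: (batch, n, dim); adj: (batch, n, n) with 1 where an edge
        # exists in the unified semantic graph, 0 otherwise.
        logits = self.q(nodes) @ self.k(nodes).transpose(1, 2) / nodes.size(-1) ** 0.5
        logits = logits.masked_fill(adj == 0, float("-inf"))  # neighbors only
        return nodes + F.softmax(logits, dim=-1) @ self.v(nodes)

nodes = torch.randn(2, 7, 64)
adj = torch.eye(7).expand(2, 7, 7)  # self-loops keep every softmax row finite
print(GraphAttentionLayer(64)(nodes, adj).shape)  # torch.Size([2, 7, 64])
```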
arXiv Detail & Related papers (2021-05-25T16:20:48Z) - Data Augmentation for Abstractive Query-Focused Multi-Document
Summarization [129.96147867496205]
We present two QMDS training datasets, which we construct using two data augmentation methods.
These two datasets have complementary properties, i.e., QMDSCNN has real summaries but queries are simulated, while QMDSIR has real queries but simulated summaries.
We build end-to-end neural network models on the combined datasets that yield new state-of-the-art transfer results on DUC datasets.
arXiv Detail & Related papers (2021-03-02T16:57:01Z) - Structured Multimodal Attentions for TextVQA [57.71060302874151]
We propose an end-to-end structured multimodal attention (SMA) neural network for TextVQA.
SMA first uses a structural graph representation to encode the object-object, object-text and text-text relationships appearing in the image, and then designs a multimodal graph attention network to reason over it.
Our proposed model outperforms the SoTA models on the TextVQA dataset and on two tasks of the ST-VQA dataset, with the exception of the pre-training-based TAP.
arXiv Detail & Related papers (2020-06-01T07:07:36Z)