HIBRIDS: Attention with Hierarchical Biases for Structure-aware Long
Document Summarization
- URL: http://arxiv.org/abs/2203.10741v1
- Date: Mon, 21 Mar 2022 05:27:35 GMT
- Title: HIBRIDS: Attention with Hierarchical Biases for Structure-aware Long
Document Summarization
- Authors: Shuyang Cao and Lu Wang
- Abstract summary: We present HIBRIDS, which injects Hierarchical Biases foR Incorporating Document Structure into the calculation of attention scores.
We also present a new task, hierarchical question-summary generation, for summarizing salient content in the source document into a hierarchy of questions and summaries.
- Score: 17.58231642569116
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Document structure is critical for efficient information consumption.
However, it is challenging to encode it efficiently into the modern Transformer
architecture. In this work, we present HIBRIDS, which injects Hierarchical
Biases foR Incorporating Document Structure into the calculation of attention
scores. We further present a new task, hierarchical question-summary
generation, for summarizing salient content in the source document into a
hierarchy of questions and summaries, where each follow-up question inquires
about the content of its parent question-summary pair. We also annotate a new
dataset with 6,153 question-summary hierarchies labeled on long government
reports. Experimental results show that our model produces better
question-summary hierarchies than comparison systems on both hierarchy
quality and content coverage, a finding also echoed by human judges.
Additionally, our
model improves the generation of long-form summaries from lengthy government
reports and Wikipedia articles, as measured by ROUGE scores.
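The bias-injection idea lends itself to a short illustration. Below is a minimal sketch, not the authors' released code: attention logits receive an additive learned bias looked up from the structural relation between the sections containing the query and key tokens. Indexing the bias tables by tree path length and level difference, and the table sizes, are assumptions made here for concreteness.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalBiasAttention(nn.Module):
    """Single-head attention with additive hierarchical biases (sketch)."""

    def __init__(self, d_model, max_path_len=16, max_level_diff=8):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # One learnable scalar bias per bucketed tree path length and per
        # (signed) level difference between document-tree sections.
        self.path_bias = nn.Embedding(max_path_len + 1, 1)
        self.level_bias = nn.Embedding(2 * max_level_diff + 1, 1)
        self.max_path_len = max_path_len
        self.max_level_diff = max_level_diff
        self.scale = d_model ** -0.5

    def forward(self, x, path_len, level_diff):
        # x: (B, N, d_model); path_len, level_diff: (B, N, N) long tensors
        # giving, for every query/key token pair, the tree distance between
        # their sections and the difference of their section depths.
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        logits = torch.einsum("bid,bjd->bij", q, k) * self.scale
        p = path_len.clamp(0, self.max_path_len)
        l = level_diff.clamp(-self.max_level_diff, self.max_level_diff) \
            + self.max_level_diff
        # Add the structural biases before the softmax.
        logits = logits + self.path_bias(p).squeeze(-1) \
                        + self.level_bias(l).squeeze(-1)
        return torch.einsum("bij,bjd->bid", F.softmax(logits, dim=-1), v)
```

In a full encoder the index tensors would be precomputed once per document from its section hierarchy, so the bias adds negligible cost over standard attention.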
Related papers
- HDT: Hierarchical Document Transformer [70.2271469410557]
HDT exploits document structure by introducing auxiliary anchor tokens and redesigning the attention mechanism into a sparse multi-level hierarchy.
We develop a novel sparse attention kernel that considers the hierarchical structure of documents.
arXiv Detail & Related papers (2024-07-11T09:28:04Z)
- Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z)
- RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval [26.527911244587134]
We introduce the novel approach of embedding, clustering, and summarizing chunks of text, constructing a tree with differing levels of summarization from the bottom up.
At inference time, our RAPTOR model retrieves from this tree, integrating information across lengthy documents at different levels of abstraction.
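(A schematic sketch of this bottom-up tree construction appears after this list.)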
arXiv Detail & Related papers (2024-01-31T18:30:21Z)
- PDFTriage: Question Answering over Long, Structured Documents [60.96667912964659]
Representing structured documents as plain text is incongruous with the user's mental model of these richly structured documents.
We propose PDFTriage that enables models to retrieve the context based on either structure or content.
Our benchmark dataset consists of 900+ human-generated questions over 80 structured documents.
arXiv Detail & Related papers (2023-09-16T04:29:05Z)
- Doc2SoarGraph: Discrete Reasoning over Visually-Rich Table-Text Documents via Semantic-Oriented Hierarchical Graphs [79.0426838808629]
We address the TAT-DQA task, i.e., answering questions over visually-rich table-text documents.
Specifically, we propose a novel Doc2SoarGraph framework with enhanced discrete reasoning capability.
We conduct extensive experiments on TAT-DQA dataset, and the results show that our proposed framework outperforms the best baseline model by 17.73% and 16.91% in terms of Exact Match (EM) and F1 score respectively on the test set.
arXiv Detail & Related papers (2023-05-03T07:30:32Z)
- Document-Level Abstractive Summarization [0.0]
We study how efficient Transformer techniques can be used to improve the automatic summarization of very long texts.
We propose a novel retrieval-enhanced approach which reduces the cost of generating a summary of the entire document by processing smaller chunks.
arXiv Detail & Related papers (2022-12-06T14:39:09Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework that improves summarization models in two aspects.
Our framework assumes a hierarchical latent structure of a document, where the top level captures long-range dependencies.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- Text Summarization with Latent Queries [60.468323530248945]
We introduce LaQSum, the first unified text summarization system that learns Latent Queries from documents for abstractive summarization with any existing query forms.
Under a deep generative framework, our system jointly optimizes a latent query model and a conditional language model, allowing users to plug in queries of any type at test time.
Our system robustly outperforms strong comparison systems across summarization benchmarks with different query types, document settings, and target domains.
arXiv Detail & Related papers (2021-05-31T21:14:58Z)
- On Generating Extended Summaries of Long Documents [16.149617108647707]
We present a new method for generating extended summaries of long papers.
Our method exploits the hierarchical structure of documents and incorporates it into an extractive summarization model.
Our analysis shows that our multi-tasking approach can adjust the extraction probability distribution in favor of summary-worthy sentences.
arXiv Detail & Related papers (2020-12-28T08:10:28Z)
- Neural Abstractive Summarization with Structural Attention [31.50918718905953]
We present a hierarchical encoder based on structural attention to model inter-sentence and inter-document dependencies.
We show that our proposed model achieves significant improvement over the baselines in both single and multi-document summarization settings.
arXiv Detail & Related papers (2020-04-21T03:39:15Z)
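To make the bottom-up construction described in the RAPTOR entry above concrete, here is a schematic sketch. The `summarize` and `cluster` functions are hypothetical stand-ins: RAPTOR summarizes with a language model and clusters by embedding similarity, whereas the placeholders below merely concatenate text and group nodes by position.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    text: str
    children: List["Node"] = field(default_factory=list)

def summarize(texts: List[str]) -> str:
    # Hypothetical stand-in; RAPTOR uses an abstractive LM summarizer here.
    return " ".join(texts)[:200]

def cluster(nodes: List[Node], size: int = 3) -> List[List[Node]]:
    # Hypothetical stand-in; RAPTOR clusters by embedding similarity,
    # not by position as done here.
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]

def build_tree(chunks: List[str]) -> Node:
    """Cluster leaves, summarize each cluster into a parent node, and
    repeat on the new level until a single root remains."""
    level = [Node(text=c) for c in chunks]
    while len(level) > 1:
        level = [Node(text=summarize([n.text for n in group]),
                      children=list(group))
                 for group in cluster(level)]
    return level[0]
```

At query time, retrieval can then draw candidates from every level of the tree, mixing fine-grained chunks with high-level summaries.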