Interpretable Multi-Headed Attention for Abstractive Summarization at
Controllable Lengths
- URL: http://arxiv.org/abs/2002.07845v2
- Date: Fri, 27 Nov 2020 21:22:14 GMT
- Title: Interpretable Multi-Headed Attention for Abstractive Summarization at
Controllable Lengths
- Authors: Ritesh Sarkhel, Moniba Keymanesh, Arnab Nandi, Srinivasan
Parthasarathy
- Abstract summary: Multi-level Summarizer (MLS) is a supervised method to construct abstractive summaries of a text document at controllable lengths.
MLS outperforms strong baselines by up to 14.70% in the METEOR score.
- Score: 14.762731718325002
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abstractive summarization at controllable lengths is a challenging task in
natural language processing. It is even more challenging for domains where
limited training data is available or for scenarios in which the length of the
summary is not known beforehand. At the same time, when it comes to trusting
machine-generated summaries, explaining how a summary was constructed in
human-understandable terms may be critical. We propose Multi-level Summarizer
(MLS), a supervised method to construct abstractive summaries of a text
document at controllable lengths. The key enabler of our method is an
interpretable multi-headed attention mechanism that computes attention
distribution over an input document using an array of timestep-independent
semantic kernels. Each kernel optimizes a human-interpretable syntactic or
semantic property. Exhaustive experiments on two low-resource datasets in the
English language show that MLS outperforms strong baselines by up to 14.70% in
the METEOR score. Human evaluation of the summaries also suggests that they
capture the key concepts of the document at various length budgets.
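The paper itself ships no code; below is a minimal Python sketch of the core idea, attention over sentences computed from timestep-independent, human-interpretable kernels. The specific kernels (position, word frequency) and the convex combination of heads are illustrative assumptions, not MLS's exact components.
```python
# Illustrative sketch of attention over sentences using human-interpretable
# "semantic kernels" (the kernel choices are assumptions, not the paper's).
import math
from collections import Counter

def position_kernel(sentences):
    # Earlier sentences score higher (lead bias).
    n = len(sentences)
    return [1.0 - i / n for i in range(n)]

def frequency_kernel(sentences):
    # Sentences built from frequent content words score higher.
    counts = Counter(w.lower() for s in sentences for w in s.split())
    return [sum(counts[w.lower()] for w in s.split()) / max(len(s.split()), 1)
            for s in sentences]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def multi_head_attention(sentences, kernels, head_weights):
    # Each head's distribution comes from one named kernel; the final
    # distribution is a convex combination of the per-kernel distributions.
    heads = [softmax(k(sentences)) for k in kernels]
    return [sum(w * h[i] for w, h in zip(head_weights, heads))
            for i in range(len(sentences))]

doc = ["MLS builds summaries at controllable lengths.",
       "Each attention head optimizes one interpretable property.",
       "Experiments cover two low-resource English datasets."]
attn = multi_head_attention(doc, [position_kernel, frequency_kernel], [0.5, 0.5])
print([round(a, 3) for a in attn])  # one weight per sentence, sums to 1
```
Because each head's distribution is derived from a named property, the final weights can be explained kernel by kernel, which is the sense in which the attention is interpretable.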
Related papers
- Hierarchical Indexing for Retrieval-Augmented Opinion Summarization [60.5923941324953]
We propose a method for unsupervised abstractive opinion summarization that combines the attributability and scalability of extractive approaches with the coherence and fluency of Large Language Models (LLMs).
Our method, HIRO, learns an index structure that maps sentences to a path through a semantically organized discrete hierarchy.
At inference time, we populate the index and use it to identify and retrieve clusters of sentences containing popular opinions from input reviews.
arXiv Detail & Related papers (2024-03-01T10:38:07Z)
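A minimal sketch of the HIRO idea above: map each sentence to a path in a discrete hierarchy, populate the index, and retrieve the most populated clusters as popular opinions. The hand-written path assignment below is a stand-in for HIRO's learned index structure.
```python
# HIRO-style indexing sketch: each sentence maps to a path in a discrete
# hierarchy; popular leaf clusters are retrieved for summarization.
from collections import defaultdict

def assign_path(sentence):
    # Hypothetical two-level path (topic, polarity); HIRO learns this mapping.
    topic = "service" if "staff" in sentence or "service" in sentence else "room"
    polarity = "neg" if "not" in sentence or "bad" in sentence else "pos"
    return (topic, polarity)

def build_index(review_sentences):
    index = defaultdict(list)          # path -> cluster of sentences
    for s in review_sentences:
        index[assign_path(s)].append(s)
    return index

def popular_clusters(index, k=2):
    # Clusters holding the most sentences represent popular opinions.
    return sorted(index.items(), key=lambda kv: -len(kv[1]))[:k]

reviews = ["The staff were friendly.", "Great service at check-in.",
           "The room was not clean.", "Spacious room with a view."]
for path, cluster in popular_clusters(build_index(reviews)):
    print(path, cluster)
```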
- Unsupervised Extractive Summarization with Learnable Length Control Strategies [33.75745103050596]
Unsupervised extractive summarization is an important technique in information extraction and retrieval.
Most existing unsupervised methods rely on graph-based ranking of sentence centrality.
This paper introduces an unsupervised extractive summarization model based on a siamese network.
arXiv Detail & Related papers (2023-12-12T00:15:26Z)
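A hedged sketch of the siamese idea in the entry above: one shared encoder embeds both the full document and each candidate sentence, and the most similar sentences are extracted under a length budget. The bag-of-words encoder is a placeholder for the paper's learned network.
```python
# Siamese extraction sketch: the same encoder is applied to both branches
# (document and sentence); similarity drives sentence selection.
import math
from collections import Counter

def encode(text):
    # Shared encoder applied to both branches of the siamese pair.
    return Counter(w.lower() for w in text.split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def extract(sentences, budget=2):
    doc_vec = encode(" ".join(sentences))
    scored = sorted(sentences, key=lambda s: -cosine(encode(s), doc_vec))
    return scored[:budget]   # length control via a simple sentence budget

print(extract(["Summarization selects salient content.",
               "Graph methods rank sentence centrality.",
               "We instead train a siamese scoring network."]))
```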
- On Context Utilization in Summarization with Large Language Models [83.84459732796302]
Large language models (LLMs) excel in abstractive summarization tasks, delivering fluent and pertinent summaries.
Recent advancements have extended their capabilities to handle long-input contexts, exceeding 100k tokens.
We conduct the first comprehensive study on context utilization and position bias in summarization.
arXiv Detail & Related papers (2023-10-16T16:45:12Z)
- MACSum: Controllable Summarization with Mixed Attributes [56.685735509260276]
MACSum is the first human-annotated summarization dataset for controlling mixed attributes.
We propose two simple and effective parameter-efficient approaches for the new task of mixed controllable summarization.
arXiv Detail & Related papers (2022-11-09T17:17:37Z)
- Unsupervised Summarization with Customized Granularities [76.26899748972423]
We propose the first unsupervised multi-granularity summarization framework, GranuSum.
By varying the number of input events, GranuSum produces summaries at multiple granularities in an unsupervised manner, as sketched below.
arXiv Detail & Related papers (2022-01-29T05:56:35Z)
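A minimal sketch of GranuSum's control knob: the number of events handed to the summarizer sets the granularity. Event extraction and the summarizer below are crude placeholders for GranuSum's learned components.
```python
# Granularity control sketch: fewer events -> coarser summary,
# more events -> finer-grained summary.
def extract_events(document_sentences):
    # Stand-in: treat each sentence's first clause as one event.
    return [s.split(",")[0] for s in document_sentences]

def summarize_with_events(events, n_events):
    # The event count is the user-facing granularity knob.
    return " ".join(events[:n_events])

sents = ["The company released a new model, beating prior work.",
         "Benchmarks improved by 12%, according to the report.",
         "A public demo followed, drawing wide attention."]
events = extract_events(sents)
print(summarize_with_events(events, 1))   # coarse summary
print(summarize_with_events(events, 3))   # fine-grained summary
```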
- Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents [13.755637074366813]
Summ^N is a simple, flexible, and effective multi-stage framework for input texts longer than the maximum context lengths of typical pretrained LMs.
It can process input text of arbitrary length by adjusting the number of stages while keeping the LM context size fixed.
Our experiments demonstrate that Summ^N significantly outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2021-10-16T06:19:54Z)
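A minimal sketch of the Summ^N loop described above: split the input into chunks that fit a fixed context budget, coarsely summarize each chunk, concatenate, and repeat until the text fits, then emit the final summary. Truncation stands in for the pretrained LM.
```python
# Multi-stage summarization sketch: the LM context size stays fixed while the
# number of stages adapts to the input length.
def summarize(text, max_words=30):
    # Placeholder coarse summarizer: truncation stands in for an LM call.
    return " ".join(text.split()[:max_words])

def chunk(words, size):
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def summ_n(source, context_words=100):
    text = source
    while len(text.split()) > context_words:        # add stages as needed
        pieces = chunk(text.split(), context_words)
        text = " ".join(summarize(p) for p in pieces)
    return summarize(text)                          # final fine-grained stage

print(summ_n("long input " * 500))
```
Each stage compresses every fixed-size chunk, so the text shrinks geometrically and the loop terminates regardless of the input length.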
- Unsupervised Extractive Summarization using Pointwise Mutual Information [5.544401446569243]
We propose new metrics of relevance and redundancy using pointwise mutual information (PMI) between sentences.
We show that our method outperforms similarity-based methods on datasets in a range of domains including news, medical journal articles, and personal anecdotes.
arXiv Detail & Related papers (2021-02-11T21:05:50Z)
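A hedged sketch of the PMI-based criterion above: greedily select sentences with high PMI to the rest of the document (relevance) and low PMI to already-selected sentences (redundancy). The word-overlap PMI estimate below is a crude stand-in for the paper's language-model-based one.
```python
# Greedy PMI selection sketch: score = relevance - lambda * redundancy.
import math
from collections import Counter

def word_probs(sentences):
    counts = Counter(w.lower() for s in sentences for w in s.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def pmi(s1, s2, probs):
    # Crude sentence-level PMI: average -log p(w) over shared words.
    shared = {w.lower() for w in s1.split()} & {w.lower() for w in s2.split()}
    if not shared:
        return 0.0
    return sum(-math.log(probs[w]) for w in shared) / len(shared)

def select(sentences, k=2, lam=1.0):
    probs = word_probs(sentences)
    chosen = []
    while len(chosen) < k:
        rest = [s for s in sentences if s not in chosen]
        def score(s):
            relevance = sum(pmi(s, r, probs) for r in rest if r != s)
            redundancy = sum(pmi(s, c, probs) for c in chosen)
            return relevance - lam * redundancy
        chosen.append(max(rest, key=score))
    return chosen

print(select(["PMI measures association between sentences.",
              "Pointwise mutual information scores relevance.",
              "The weather was pleasant that day."], k=2))
```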
- An Enhanced MeanSum Method For Generating Hotel Multi-Review Summarizations [0.06091702876917279]
This work uses a Multi-Aspect Masker (MAM) as a content selector to handle multiple review aspects.
We also propose a regularizer to control the length of the generated summaries.
Our improved model achieves higher ROUGE and sentiment accuracy scores than the original MeanSum method.
arXiv Detail & Related papers (2020-12-07T13:16:01Z)
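The length regularizer mentioned above can be sketched as a penalty added to the training loss; the quadratic form and its weight are assumptions, not the paper's exact formulation.
```python
# Length-control regularizer sketch: penalize summaries whose length drifts
# from a target budget (quadratic penalty is an illustrative assumption).
def length_regularized_loss(base_loss, summary_tokens, target_len, alpha=0.01):
    deviation = len(summary_tokens) - target_len
    return base_loss + alpha * deviation ** 2

# e.g. a reconstruction loss of 2.3 on a 60-token summary with a 50-token budget
print(length_regularized_loss(2.3, ["tok"] * 60, target_len=50))
```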
- SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression [61.97200991151141]
SummPip is an unsupervised method for multi-document summarization.
We convert the original documents into a sentence graph, taking both linguistic and deep representations into account.
We then apply spectral clustering to obtain multiple clusters of sentences, and finally compress each cluster to generate the final summary.
arXiv Detail & Related papers (2020-07-17T13:01:15Z)
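A minimal sketch of the SummPip pipeline above: build a sentence similarity graph, cluster it spectrally, and compress each cluster. Word-overlap similarity and keep-the-most-central-sentence "compression" are simplifications of the paper's linguistic-plus-deep representations and graph-based compression; this assumes NumPy and scikit-learn are available.
```python
# SummPip-style pipeline sketch: similarity graph -> spectral clustering ->
# per-cluster compression.
import numpy as np
from sklearn.cluster import SpectralClustering

def overlap(a, b):
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def summpip(sentences, n_clusters=2):
    sim = np.array([[overlap(a, b) for b in sentences] for a in sentences])
    labels = SpectralClustering(n_clusters=n_clusters,
                                affinity="precomputed").fit_predict(sim)
    summary = []
    for c in range(n_clusters):
        members = [s for s, l in zip(sentences, labels) if l == c]
        if members:
            # Placeholder compression: keep the most central member.
            summary.append(max(members,
                               key=lambda s: sum(overlap(s, m) for m in members)))
    return summary

docs = ["The phone's battery lasts two days.", "Battery life is excellent.",
        "The screen cracks easily.", "Many users report a fragile screen."]
print(summpip(docs, n_clusters=2))
```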
- Salience Estimation with Multi-Attention Learning for Abstractive Text Summarization [86.45110800123216]
In the task of text summarization, salience estimation for words, phrases or sentences is a critical component.
We propose a Multi-Attention Learning framework which contains two new attention learning components for salience estimation.
arXiv Detail & Related papers (2020-04-07T02:38:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.