Generating EDU Extracts for Plan-Guided Summary Re-Ranking
- URL: http://arxiv.org/abs/2305.17779v1
- Date: Sun, 28 May 2023 17:22:04 GMT
- Title: Generating EDU Extracts for Plan-Guided Summary Re-Ranking
- Authors: Griffin Adams, Alexander R. Fabbri, Faisal Ladhak, Kathleen McKeown, Noémie Elhadad
- Abstract summary: Two-step approaches, in which summary candidates are generated-then-reranked to return a single summary, can improve ROUGE scores over the standard single-step approach.
We design a novel method to generate candidates for re-ranking that addresses these issues.
We show large relevance improvements over previously published methods on widely used single document news article corpora.
- Score: 77.7752504102925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Two-step approaches, in which summary candidates are generated-then-reranked
to return a single summary, can improve ROUGE scores over the standard
single-step approach. Yet, standard decoding methods (i.e., beam search,
nucleus sampling, and diverse beam search) produce candidates with redundant,
and often low quality, content. In this paper, we design a novel method to
generate candidates for re-ranking that addresses these issues. We ground each
candidate abstract on its own unique content plan and generate distinct
plan-guided abstracts using a model's top beam. More concretely, a standard
language model (a BART LM) auto-regressively generates elemental discourse unit
(EDU) content plans with an extractive copy mechanism. The top K beams from the
content plan generator are then used to guide a separate LM, which produces a
single abstractive candidate for each distinct plan. We apply an existing
re-ranker (BRIO) to abstractive candidates generated from our method, as well
as baseline decoding methods. We show large relevance improvements over
previously published methods on widely used single document news article
corpora, with ROUGE-2 F1 gains of 0.88, 2.01, and 0.38 on CNN / Dailymail, NYT,
and Xsum, respectively. A human evaluation on CNN / DM validates these results.
Similarly, on 1k samples from CNN / DM, we show that prompting GPT-3 to follow
EDU plans outperforms sampling-based methods by 1.05 ROUGE-2 F1 points. Code to
generate and realize plans is available at
https://github.com/griff4692/edu-sum.
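
To make the pipeline described in the abstract easier to follow, here is a minimal sketch of the generate-plans, realize-candidates, then re-rank control flow. It is not the released implementation (see the repository above): public BART checkpoints stand in for the EDU content-plan generator and the plan-guided abstractor, the extractive copy mechanism is omitted, and the BRIO re-ranker is replaced by a placeholder scoring function.

```python
# Minimal sketch of the two-step, plan-guided idea. NOT the authors' code:
# the EDU copy mechanism and BRIO are replaced by stand-ins, and the model
# names below are placeholders.
from transformers import BartForConditionalGeneration, BartTokenizer

PLAN_MODEL = "facebook/bart-large"          # stand-in for the EDU content-plan generator
ABSTRACT_MODEL = "facebook/bart-large-cnn"  # stand-in for the plan-guided abstractor

tokenizer = BartTokenizer.from_pretrained(ABSTRACT_MODEL)
plan_model = BartForConditionalGeneration.from_pretrained(PLAN_MODEL)
abstract_model = BartForConditionalGeneration.from_pretrained(ABSTRACT_MODEL)

def generate_plans(document: str, k: int = 4) -> list[str]:
    """Step 1: produce K distinct content plans (the top-K beams)."""
    inputs = tokenizer(document, return_tensors="pt", truncation=True)
    beams = plan_model.generate(
        **inputs, num_beams=k, num_return_sequences=k, max_length=128
    )
    return tokenizer.batch_decode(beams, skip_special_tokens=True)

def generate_candidates(document: str, plans: list[str]) -> list[str]:
    """Step 2: realize one abstractive candidate per plan by conditioning
    the abstractor on the plan concatenated with the document."""
    candidates = []
    for plan in plans:
        text = plan + " </s> " + document  # concatenation format is an assumption
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        out = abstract_model.generate(**inputs, num_beams=4, max_length=142)
        candidates.append(tokenizer.decode(out[0], skip_special_tokens=True))
    return candidates

def rerank(candidates: list[str]) -> str:
    """Step 3: the paper applies the existing BRIO re-ranker; this placeholder
    simply prefers the longest candidate."""
    return max(candidates, key=len)

article = "Replace with a news article..."
best_summary = rerank(generate_candidates(article, generate_plans(article)))
```

In the actual method each plan is a sequence of elemental discourse units copied from the source document, and BRIO scores every plan-guided candidate; the sketch only mirrors the overall two-step structure.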
Related papers
- Integrating Planning into Single-Turn Long-Form Text Generation [66.08871753377055]
We propose to use planning to generate long form content.
Our main novelty lies in a single auxiliary task that does not require multiple rounds of prompting or planning.
Our experiments demonstrate on two datasets from different domains, that LLMs fine-tuned with the auxiliary task generate higher quality documents.
arXiv Detail & Related papers (2024-10-08T17:02:40Z)
- AugSumm: towards generalizable speech summarization using synthetic labels from large language model [61.73741195292997]
Abstractive speech summarization (SSUM) aims to generate human-like summaries from speech.
Conventional SSUM models are mostly trained and evaluated with a single ground-truth (GT) human-annotated deterministic summary.
We propose AugSumm, a method to leverage large language models (LLMs) as a proxy for human annotators to generate augmented summaries.
arXiv Detail & Related papers (2024-01-10T18:39:46Z)
- Absformer: Transformer-based Model for Unsupervised Multi-Document Abstractive Summarization [1.066048003460524]
Multi-document summarization (MDS) refers to the task of summarizing the text in multiple documents into a concise summary.
Abstractive MDS aims to generate a coherent and fluent summary for multiple documents using natural language generation techniques.
We propose Absformer, a new Transformer-based method for unsupervised abstractive summary generation.
arXiv Detail & Related papers (2023-06-07T21:18:23Z)
- Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method [35.181659789684545]
Automatic summarization generates concise summaries that contain key ideas of source documents.
References from CNN/DailyMail and BBC XSum are noisy, mainly in terms of factual hallucination and information redundancy.
We propose a Summary Chain-of-Thought (SumCoT) technique to elicit LLMs to generate summaries step by step.
Experimental results show our method outperforms state-of-the-art fine-tuned PLMs and zero-shot LLMs by +4.33/+4.77 in ROUGE-L.
arXiv Detail & Related papers (2023-05-22T18:54:35Z)
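
A rough illustration of the step-by-step (chain-of-thought) prompting idea in the SumCoT entry above. The element categories, the prompt wording, and the call_llm helper are assumptions for illustration; they are not the paper's exact prompts or code.

```python
# Two-step summary prompting in the spirit of SumCoT. The wording and the
# `call_llm` helper are hypothetical stand-ins for any chat/completion API.
def build_sumcot_prompts(article: str) -> tuple[str, str]:
    step1 = (
        "Read the article and list its core elements "
        "(entities, dates, key events, outcomes):\n\n" + article
    )
    step2_template = (
        "Using the article and the extracted elements below, write a concise, "
        "faithful summary.\n\nElements:\n{elements}\n\nArticle:\n" + article
    )
    return step1, step2_template

def summarize_step_by_step(article: str, call_llm) -> str:
    step1, step2_template = build_sumcot_prompts(article)
    elements = call_llm(step1)                                   # step 1: extract elements
    return call_llm(step2_template.format(elements=elements))    # step 2: write the summary
```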
- Towards Abstractive Timeline Summarisation using Preference-based Reinforcement Learning [3.6640004265358477]
This paper introduces a novel pipeline for summarising timelines of events reported by multiple news sources.
Transformer-based models for abstractive summarisation generate coherent and concise summaries of long documents.
While extractive summaries are more faithful to their sources, they may be less readable and contain redundant or unnecessary information.
arXiv Detail & Related papers (2022-11-14T18:24:13Z)
- Towards Summary Candidates Fusion [26.114829566197976]
We propose a new paradigm in second-stage abstractive summarization called SummaFusion.
It fuses several summary candidates to produce a novel abstractive second-stage summary.
Our method works well on several summarization datasets, improving both the ROUGE scores and qualitative properties of fused summaries.
arXiv Detail & Related papers (2022-10-17T06:48:05Z)
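
A minimal sketch of second-stage candidate fusion in the spirit of the SummaFusion entry above, assuming a public BART checkpoint and a simple separator-joined input; the released SummaFusion model and its input format may differ.

```python
# Fuse several first-stage candidates into one second-stage summary with a
# generic seq2seq model. Checkpoint and separator format are assumptions.
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
fuser = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

def fuse_candidates(candidates: list[str]) -> str:
    joined = " </s> ".join(candidates)  # separator choice is an assumption
    inputs = tokenizer(joined, return_tensors="pt", truncation=True)
    out = fuser.generate(**inputs, num_beams=4, max_length=142)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```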
- Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees [89.60269205320431]
Current abstractive summarization models either suffer from a lack of clear interpretability or provide incomplete rationales.
We propose the Summarization Program (SP), an interpretable modular framework consisting of an (ordered) list of binary trees.
A Summarization Program contains one root node per summary sentence, and a distinct tree connects each summary sentence to the document sentences.
arXiv Detail & Related papers (2022-09-21T16:50:22Z)
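
A toy illustration of the structure described in the Summarization Programs entry above: an ordered list of binary trees, one root per summary sentence, with document sentences at the leaves. The module names and example values are assumptions for illustration, not the authors' implementation.

```python
# Toy Summarization Program: each tree derives one summary sentence from
# document sentences at its leaves via neural modules (names are assumed).
from dataclasses import dataclass
from typing import Optional

@dataclass
class SPNode:
    module: str                       # e.g. "fusion", "compression", or "leaf"
    text: str                         # sentence produced (or copied) at this node
    left: Optional["SPNode"] = None
    right: Optional["SPNode"] = None

tree = SPNode(
    module="fusion",
    text="Summary sentence built by fusing two source sentences.",
    left=SPNode("leaf", "First relevant document sentence."),
    right=SPNode("leaf", "Second relevant document sentence."),
)
summarization_program = [tree]        # ordered list: one root per summary sentence
```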
- SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization [26.114829566197976]
We show that it is possible to directly train a second-stage model performing re-ranking on a set of summary candidates.
Our mixture-of-experts SummaReranker learns to select a better candidate and consistently improves the performance of the base model.
arXiv Detail & Related papers (2022-03-13T05:05:10Z)
- KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation [100.79870384880333]
We propose a knowledge-grounded pre-training (KGPT) to generate knowledge-enriched text.
We adopt three settings, namely fully-supervised, zero-shot, few-shot to evaluate its effectiveness.
Under zero-shot setting, our model achieves over 30 ROUGE-L on WebNLG while all other baselines fail.
arXiv Detail & Related papers (2020-10-05T19:59:05Z)
- SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression [61.97200991151141]
SummPip is an unsupervised method for multi-document summarization.
We convert the original documents to a sentence graph, taking both linguistic and deep representation into account.
We then apply spectral clustering to obtain multiple clusters of sentences, and finally compress each cluster to generate the final summary.
arXiv Detail & Related papers (2020-07-17T13:01:15Z)
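
A minimal sketch of the SummPip-style pipeline described above (sentence graph, spectral clustering, per-cluster compression). TF-IDF cosine similarity stands in for the paper's combined linguistic and deep representations, and "compression" here simply keeps each cluster's most central sentence rather than building a word graph.

```python
# Sentence graph -> spectral clustering -> per-cluster "compression".
# NOT the SummPip code; representations and compression are simplified stand-ins.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import SpectralClustering

def summpip_like(sentences: list[str], n_clusters: int = 3) -> list[str]:
    # Build a sentence similarity graph (affinity matrix).
    tfidf = TfidfVectorizer().fit_transform(sentences)
    affinity = cosine_similarity(tfidf)

    # Cluster sentences over the precomputed graph.
    labels = SpectralClustering(
        n_clusters=n_clusters, affinity="precomputed", random_state=0
    ).fit_predict(affinity)

    # Keep the most central sentence of each cluster as its "compressed" form.
    summary = []
    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        centrality = affinity[np.ix_(idx, idx)].sum(axis=1)
        summary.append(sentences[idx[centrality.argmax()]])
    return summary
```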