From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
- URL: http://arxiv.org/abs/2309.04269v1
- Date: Fri, 8 Sep 2023 11:31:08 GMT
- Title: From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
- Authors: Griffin Adams, Alexander Fabbri, Faisal Ladhak, Eric Lehman, Noémie Elhadad
- Abstract summary: A good summary should be detailed and entity-centric without being overly dense and hard to follow.
We solicit increasingly dense GPT-4 summaries with what we refer to as a "Chain of Density" prompt.
We conduct a human preference study on 100 CNN DailyMail articles and find that humans prefer GPT-4 summaries that are more dense than those generated by a vanilla prompt.
- Score: 57.25154420382581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Selecting the "right" amount of information to include in a summary is a
difficult task. A good summary should be detailed and entity-centric without
being overly dense and hard to follow. To better understand this tradeoff, we
solicit increasingly dense GPT-4 summaries with what we refer to as a "Chain
of Density" (CoD) prompt. Specifically, GPT-4 generates an initial
entity-sparse summary before iteratively incorporating missing salient entities
without increasing the length. Summaries generated by CoD are more abstractive,
exhibit more fusion, and have less of a lead bias than GPT-4 summaries
generated by a vanilla prompt. We conduct a human preference study on 100 CNN
DailyMail articles and find that humans prefer GPT-4 summaries that are
more dense than those generated by a vanilla prompt and almost as dense as
human written summaries. Qualitative analysis supports the notion that there
exists a tradeoff between informativeness and readability. 500 annotated CoD
summaries, as well as an extra 5,000 unannotated summaries, are freely
available on HuggingFace
(https://huggingface.co/datasets/griffin/chain_of_density).
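The iterative procedure described in the abstract (start from an entity-sparse summary, then repeatedly fold in missing entities without growing the length) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `densify` callback (a stand-in for a GPT-4 call), the default step count, the word-count length budget, and the substring-based density proxy are all assumptions made for this example.

```python
from typing import Callable, List


def entity_density(summary: str, entities: List[str]) -> float:
    """Crude proxy: listed entities found in the summary, per whitespace token.

    Assumption: entities are matched by case-insensitive substring search,
    which is far simpler than a real entity tagger.
    """
    tokens = summary.split()
    if not tokens:
        return 0.0
    lowered = summary.lower()
    mentioned = sum(1 for entity in entities if entity.lower() in lowered)
    return mentioned / len(tokens)


def chain_of_density(
    article: str,
    initial_summary: str,
    densify: Callable[[str, str], str],
    steps: int = 3,  # arbitrary default for this sketch
) -> List[str]:
    """Return the chain of summaries, sparsest first.

    `densify(article, previous_summary)` is a hypothetical callback that
    should return a rewrite containing additional salient entities.
    """
    summaries = [initial_summary]
    budget = len(initial_summary.split())  # length must not increase
    for _ in range(steps - 1):
        candidate = densify(article, summaries[-1])
        if len(candidate.split()) > budget:
            break  # stop if the rewrite exceeds the length budget
        summaries.append(candidate)
    return summaries
```

In practice `densify` would wrap a model call; any plain function with the same signature works for experimenting with the loop and the length check.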
Related papers
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created human-annotated dataset consisting of coherent summaries for five publicly available datasets and natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% Rouge-L) in terms of producing coherent summaries.
(2024-07-05T20:25:04Z)
- GUMsley: Evaluating Entity Salience in Summarization for 12 English Genres [14.37990666928991]
We present and evaluate GUMsley, the first entity salience dataset covering all named and non-named salient entities for 12 genres of English text.
We show that predicting or providing salient entities to several model architectures enhances performance and helps derive higher-quality summaries.
(2024-01-31T16:30:50Z)
- AugSumm: towards generalizable speech summarization using synthetic labels from large language model [61.73741195292997]
Abstractive speech summarization (SSUM) aims to generate human-like summaries from speech.
Conventional SSUM models are mostly trained and evaluated with a single ground-truth (GT) human-annotated deterministic summary.
We propose AugSumm, a method to leverage large language models (LLMs) as a proxy for human annotators to generate augmented summaries.
(2024-01-10T18:39:46Z)
- On Context Utilization in Summarization with Large Language Models [83.84459732796302]
Large language models (LLMs) excel in abstractive summarization tasks, delivering fluent and pertinent summaries.
Recent advancements have extended their capabilities to handle long-input contexts, exceeding 100k tokens.
We conduct the first comprehensive study on context utilization and position bias in summarization.
(2023-10-16T16:45:12Z)
- Question-Answering Approach to Evaluating Legal Summaries [0.43512163406551996]
GPT-4 is used to generate a set of question-answer pairs that cover main points and information in the reference summary.
GPT-4 is then used to generate answers based on the generated summary for the questions from the reference summary.
GPT-4 grades the answers from the reference summary and the generated summary.
(2023-09-26T15:36:29Z)
- Extractive is not Faithful: An Investigation of Broad Unfaithfulness Problems in Extractive Summarization [91.86501509439815]
In this work, we define a typology with five types of broad unfaithfulness problems that can appear in extractive summaries.
We ask humans to label these problems out of 1600 English summaries produced by 16 diverse extractive systems.
To automatically detect these problems, we find that 5 existing faithfulness evaluation metrics for summarization have poor correlations with human judgment.
(2022-09-08T03:25:18Z)
- Screenplay Summarization Using Latent Narrative Structure [78.45316339164133]
We propose to explicitly incorporate the underlying structure of narratives into general unsupervised and supervised extractive summarization models.
We formalize narrative structure in terms of key narrative events (turning points) and treat it as latent in order to summarize screenplays.
Experimental results on the CSI corpus of TV screenplays, which we augment with scene-level summarization labels, show that latent turning points correlate with important aspects of a CSI episode.
(2020-04-27T11:54:19Z)