Automatic Summarization of Open-Domain Podcast Episodes
- URL: http://arxiv.org/abs/2011.04132v2
- Date: Thu, 12 Nov 2020 17:34:35 GMT
- Title: Automatic Summarization of Open-Domain Podcast Episodes
- Authors: Kaiqiang Song and Chen Li and Xiaoyang Wang and Dong Yu and Fei Liu
- Abstract summary: We present implementation details of our abstractive summarizers that achieve competitive results on the Podcast Summarization task of TREC 2020.
Our best system achieves a quality rating of 1.559 judged by NIST evaluators---an absolute increase of 0.268 (+21%) over the creator descriptions.
- Score: 33.268079036601634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present implementation details of our abstractive summarizers that achieve
competitive results on the Podcast Summarization task of TREC 2020. A concise
textual summary that captures important information is crucial for users to
decide whether to listen to the podcast. Prior work focuses primarily on
learning contextualized representations. Instead, we investigate several
less-studied aspects of neural abstractive summarization, including (i) the
importance of selecting important segments from transcripts to serve as input
to the summarizer; (ii) striking a balance between the amount and quality of
training instances; (iii) the appropriate summary length and start/end points.
We highlight the design considerations behind our system and offer key insights
into the strengths and weaknesses of neural abstractive systems. Our results
suggest that identifying important segments from transcripts to use as input to
an abstractive summarizer is advantageous for summarizing long documents. Our
best system achieves a quality rating of 1.559 judged by NIST evaluators---an
absolute increase of 0.268 (+21%) over the creator descriptions.
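The abstract's key finding is that selecting important segments from a long transcript before abstractive summarization helps. The paper does not specify the selector here, so the sketch below is a hypothetical baseline, not the authors' method: it scores transcript sentences with a simple TF-ISF-style weight (words that are rare across sentences count more) and keeps the top-k sentences in their original order as input for a downstream summarizer.

```python
import math
import re
from collections import Counter

def select_salient_segments(transcript: str, k: int = 3) -> list[str]:
    """Return the k highest-scoring sentences, in original order.

    Hypothetical illustration of pre-selecting transcript segments;
    the paper's actual selection criterion may differ.
    """
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", transcript)
                 if s.strip()]
    tokens_per_sent = [re.findall(r"[a-z']+", s.lower()) for s in sentences]

    # Sentence frequency of each word (how many sentences contain it).
    sf = Counter()
    for toks in tokens_per_sent:
        sf.update(set(toks))

    n = len(sentences)

    def score(toks: list[str]) -> float:
        if not toks:
            return 0.0
        # Words concentrated in few sentences get higher weight.
        return sum(math.log(1 + n / sf[t]) for t in toks) / len(toks)

    top = sorted(range(n), key=lambda i: score(tokens_per_sent[i]),
                 reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]  # preserve transcript order
```

The selected segments would then be concatenated and fed to the abstractive summarizer in place of the full transcript, keeping the input within the model's length budget.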
Related papers
- SummIt: Iterative Text Summarization via ChatGPT [12.966825834765814]
We propose SummIt, an iterative text summarization framework based on large language models like ChatGPT.
Our framework enables the model to refine the generated summary iteratively through self-evaluation and feedback.
We also conduct a human evaluation to validate the effectiveness of the iterative refinements and identify a potential issue of over-correction.
arXiv Detail & Related papers (2023-05-24T07:40:06Z)
- Salience Allocation as Guidance for Abstractive Summarization [61.31826412150143]
We propose a novel summarization approach with a flexible and reliable salience guidance, namely SEASON (SaliencE Allocation as Guidance for Abstractive SummarizatiON).
SEASON utilizes the allocation of salience expectation to guide abstractive summarization and adapts well to articles in different abstractiveness.
arXiv Detail & Related papers (2022-10-22T02:13:44Z)
- Podcast Summary Assessment: A Resource for Evaluating Summary Assessment Methods [42.08097583183816]
We describe a new dataset, the podcast summary assessment corpus.
This dataset has two unique aspects: (i) long, speech-based podcast documents; and (ii) an opportunity to detect inappropriate reference summaries in a podcast corpus.
arXiv Detail & Related papers (2022-08-28T18:24:41Z)
- Towards Abstractive Grounded Summarization of Podcast Transcripts [33.268079036601634]
Summarization of podcast transcripts is of practical benefit to both content providers and consumers.
It helps consumers quickly decide whether to listen to a podcast and reduces the burden on content providers of writing summaries.
However, podcast summarization faces significant challenges including factual inconsistencies with respect to the inputs.
arXiv Detail & Related papers (2022-03-22T02:44:39Z)
- Learning Opinion Summarizers by Selecting Informative Reviews [81.47506952645564]
We collect a large dataset of summaries paired with user reviews for over 31,000 products, enabling supervised training.
The content of many reviews is not reflected in the human-written summaries, and, thus, the summarizer trained on random review subsets hallucinates.
We formulate the task as jointly learning to select informative subsets of reviews and summarizing the opinions expressed in these subsets.
arXiv Detail & Related papers (2021-09-09T15:01:43Z)
- Controllable Abstractive Dialogue Summarization with Sketch Supervision [56.59357883827276]
Our model achieves state-of-the-art performance on SAMSum, the largest dialogue summarization corpus, with a ROUGE-L score as high as 50.79.
arXiv Detail & Related papers (2021-05-28T19:05:36Z)
- What Makes a Good Summary? Reconsidering the Focus of Automatic Summarization [49.600619575148706]
We find that the current focus of the field does not fully align with participants' wishes.
Based on our findings, we argue that it is important to adopt a broader perspective on automatic summarization.
arXiv Detail & Related papers (2020-12-14T15:12:35Z)
- Screenplay Summarization Using Latent Narrative Structure [78.45316339164133]
We propose to explicitly incorporate the underlying structure of narratives into general unsupervised and supervised extractive summarization models.
We formalize narrative structure in terms of key narrative events (turning points) and treat it as latent in order to summarize screenplays.
Experimental results on the CSI corpus of TV screenplays, which we augment with scene-level summarization labels, show that latent turning points correlate with important aspects of a CSI episode.
arXiv Detail & Related papers (2020-04-27T11:54:19Z)
- Salience Estimation with Multi-Attention Learning for Abstractive Text Summarization [86.45110800123216]
In the task of text summarization, salience estimation for words, phrases or sentences is a critical component.
We propose a Multi-Attention Learning framework which contains two new attention learning components for salience estimation.
arXiv Detail & Related papers (2020-04-07T02:38:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.