Few-Shot Learning for Opinion Summarization
- URL: http://arxiv.org/abs/2004.14884v3
- Date: Sat, 10 Oct 2020 06:30:38 GMT
- Title: Few-Shot Learning for Opinion Summarization
- Authors: Arthur Bražinskas, Mirella Lapata, Ivan Titov
- Abstract summary: Opinion summarization is the automatic creation of text reflecting subjective information expressed in multiple documents.
In this work, we show that even a handful of summaries is sufficient to bootstrap generation of the summary text.
Our approach substantially outperforms previous extractive and abstractive methods in automatic and human evaluation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Opinion summarization is the automatic creation of text reflecting subjective
information expressed in multiple documents, such as user reviews of a product.
The task is practically important and has attracted a lot of attention.
However, due to the high cost of summary production, datasets large enough for
training supervised models are lacking. Instead, the task has been
traditionally approached with extractive methods that learn to select text
fragments in an unsupervised or weakly-supervised way. Recently, it has been
shown that abstractive summaries, potentially more fluent and better at
reflecting conflicting information, can also be produced in an unsupervised
fashion. However, these models, not being exposed to actual summaries, fail to
capture their essential properties. In this work, we show that even a handful
of summaries is sufficient to bootstrap generation of the summary text with all
expected properties, such as writing style, informativeness, fluency, and
sentiment preservation. We start by training a conditional Transformer language
model to generate a new product review given other available reviews of the
product. The model is also conditioned on review properties that are directly
related to summaries; the properties are derived from reviews with no manual
effort. In the second stage, we fine-tune a plug-in module that learns to
predict property values on a handful of summaries. This lets us switch the
generator to the summarization mode. We show on Amazon and Yelp datasets that
our approach substantially outperforms previous extractive and abstractive
methods in automatic and human evaluation.
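The two-stage idea in the abstract — derive control properties from raw reviews with no manual effort, then fit a small plug-in on a handful of summaries to pick the property values that switch the generator into summarization mode — can be sketched minimally as follows. This is not the authors' code: the paper uses a conditional Transformer generator, and the word-overlap and length statistics below are illustrative stand-ins for its review-derived properties; all function names are hypothetical.

```python
def content_coverage(target, others):
    """Fraction of the target's words that also appear in the other reviews
    (a crude stand-in for the overlap-based properties in the paper)."""
    target_words = set(target.lower().split())
    pool = set(w for r in others for w in r.lower().split())
    return len(target_words & pool) / max(len(target_words), 1)

def length_ratio(target, others):
    """Length of the target relative to the average review length."""
    avg = sum(len(r.split()) for r in others) / len(others)
    return len(target.split()) / avg

def derive_properties(target, others):
    """Stage 1: properties computed from raw reviews, no manual effort."""
    return {"coverage": content_coverage(target, others),
            "length_ratio": length_ratio(target, others)}

def fit_plugin(few_shot_pairs):
    """Stage 2: average the properties observed on a handful of
    (reviews, summary) pairs; at test time these averages would be fed
    to the generator as target property values."""
    props = [derive_properties(summary, reviews)
             for reviews, summary in few_shot_pairs]
    keys = props[0].keys()
    return {k: sum(p[k] for p in props) / len(props) for k in keys}
```

Note the asymmetry the paper exploits: during pre-training the "target" is just another review of the product, so properties are free to compute at scale; only the plug-in that predicts summary-like property values needs the handful of gold summaries.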
Related papers
- Template-based Abstractive Microblog Opinion Summarisation
We introduce the task of microblog opinion summarisation (MOS) and share a dataset of 3100 gold-standard opinion summaries.
The dataset contains summaries of tweets spanning a 2-year period and covers more topics than any other public Twitter summarisation dataset.
arXiv Detail & Related papers (2022-08-08T12:16:01Z)
- Efficient Few-Shot Fine-Tuning for Opinion Summarization
Abstractive summarization models are typically pre-trained on large amounts of generic texts, then fine-tuned on tens or hundreds of thousands of annotated samples.
We show that a few-shot method based on adapters can easily store in-domain knowledge.
We show that this self-supervised adapter pre-training improves summary quality over standard fine-tuning by 2.0 and 1.3 ROUGE-L points on the Amazon and Yelp datasets.
arXiv Detail & Related papers (2022-05-04T16:38:37Z)
- StreamHover: Livestream Transcript Summarization and Annotation
We present StreamHover, a framework for annotating and summarizing livestream transcripts.
With a total of over 500 hours of videos annotated with both extractive and abstractive summaries, our benchmark dataset is significantly larger than currently existing annotated corpora.
We show that our model generalizes better and improves performance over strong baselines.
arXiv Detail & Related papers (2021-09-11T02:19:37Z)
- Aspect-Controllable Opinion Summarization
We propose an approach that allows the generation of customized summaries based on aspect queries.
Using a review corpus, we create a synthetic training dataset of (review, summary) pairs enriched with aspect controllers.
We fine-tune a pretrained model using our synthetic dataset and generate aspect-specific summaries by modifying the aspect controllers.
arXiv Detail & Related papers (2021-09-07T16:09:17Z)
- To Point or Not to Point: Understanding How Abstractive Summarizers Paraphrase Text
We characterize how one popular abstractive model, the pointer-generator model of See et al., uses its explicit copy/generation switch to control its level of abstraction.
When we modify the copy/generation switch and force the model to generate, only simple neural abilities are revealed alongside factual inaccuracies and hallucinations.
In line with previous research, these results suggest that abstractive summarization models lack the semantic understanding necessary to generate paraphrases that are both abstractive and faithful to the source document.
arXiv Detail & Related papers (2021-06-03T04:03:15Z)
- Automated News Summarization Using Transformers
We present a comprehensive comparison of several Transformer-based pre-trained models for text summarization.
For analysis and comparison, we use the BBC News dataset, which contains articles suitable for summarization along with human-generated summaries.
arXiv Detail & Related papers (2021-04-23T04:22:33Z)
- Transductive Learning for Abstractive News Summarization
We propose the first application of transductive learning to summarization.
We show that our approach yields state-of-the-art results on CNN/DM and NYT datasets.
arXiv Detail & Related papers (2021-04-17T17:33:12Z)
- Unsupervised Opinion Summarization with Content Planning
We show that explicitly incorporating content planning in a summarization model yields output of higher quality.
We also create synthetic datasets which are more natural, resembling real-world document-summary pairs.
Our approach outperforms competitive models in generating informative, coherent, and fluent summaries.
arXiv Detail & Related papers (2020-12-14T18:41:58Z)
- Unsupervised Opinion Summarization with Noising and Denoising
We create a synthetic dataset from a corpus of user reviews by sampling a review, pretending it is a summary, and generating noisy versions thereof.
At test time, the model accepts genuine reviews and generates a summary containing salient opinions, treating those that do not reach consensus as noise.
arXiv Detail & Related papers (2020-04-21T16:54:57Z)
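The noising procedure described in the last entry — sample a review, pretend it is a summary, and generate noisy versions of it to serve as synthetic inputs — can be sketched as below. This is an illustrative toy, not that paper's implementation: the word-drop and word-swap noise functions are assumptions, and `make_synthetic_pair` is a hypothetical name.

```python
import random

def make_synthetic_pair(reviews, seed=0):
    """Sample one review as a pseudo-summary, then build noisy input
    'reviews' by dropping words and swapping in words drawn from the
    other reviews. A denoising model trained on such pairs learns to
    recover consensus content from noisy inputs."""
    rng = random.Random(seed)
    summary = rng.choice(reviews)
    others = [r for r in reviews if r != summary]
    # Vocabulary for word swaps, taken from the remaining reviews.
    vocab = [w for r in others for w in r.split()] or summary.split()

    def noise(text):
        words = []
        for w in text.split():
            roll = rng.random()
            if roll < 0.1:            # drop this word
                continue
            if roll < 0.2:            # replace with a word from other reviews
                words.append(rng.choice(vocab))
            else:
                words.append(w)
        return " ".join(words)

    noisy_inputs = [noise(summary) for _ in range(3)]
    return noisy_inputs, summary
```

Training on (noisy_inputs, summary) pairs built this way requires no annotated summaries at all, which is what makes the approach unsupervised.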
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.