Learning Opinion Summarizers by Selecting Informative Reviews
- URL: http://arxiv.org/abs/2109.04325v1
- Date: Thu, 9 Sep 2021 15:01:43 GMT
- Title: Learning Opinion Summarizers by Selecting Informative Reviews
- Authors: Arthur Bražinskas, Mirella Lapata, Ivan Titov
- Abstract summary: We collect a large dataset of summaries paired with user reviews for over 31,000 products, enabling supervised training.
The content of many reviews is not reflected in the human-written summaries, and, thus, the summarizer trained on random review subsets hallucinates.
We formulate the task as jointly learning to select informative subsets of reviews and summarizing the opinions expressed in these subsets.
- Score: 81.47506952645564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Opinion summarization has been traditionally approached with unsupervised,
weakly-supervised and few-shot learning techniques. In this work, we collect a
large dataset of summaries paired with user reviews for over 31,000 products,
enabling supervised training. However, the number of reviews per product is
large (320 on average), making summarization - and especially training a
summarizer - impractical. Moreover, the content of many reviews is not
reflected in the human-written summaries, and, thus, the summarizer trained on
random review subsets hallucinates. In order to deal with both of these
challenges, we formulate the task as jointly learning to select informative
subsets of reviews and summarizing the opinions expressed in these subsets. The
choice of the review subset is treated as a latent variable, predicted by a
small and simple selector. The subset is then fed into a more powerful
summarizer. For joint training, we use amortized variational inference and
policy gradient methods. Our experiments demonstrate the importance of
selecting informative reviews resulting in improved quality of summaries and
reduced hallucinations.
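The selector-summarizer setup described in the abstract can be illustrated with a small end-to-end sketch. The code below is not the authors' implementation: the module sizes, the toy data, the mean-pooled GRU "summarizer", and the plain REINFORCE estimator (omitting the amortized variational posterior and any variance reduction) are simplifying assumptions meant only to show how a latent review-subset selector and a summarizer can be trained jointly.

```python
# Minimal sketch of joint selector + summarizer training with a policy-gradient
# (REINFORCE) estimator. Module sizes, the toy data, and the crude mean-pooled
# decoder below are illustrative placeholders, not the paper's architecture; the
# amortized variational posterior and variance-reduction terms are omitted.
import torch
import torch.nn as nn
from torch.distributions import Bernoulli

VOCAB, EMB, HID = 1000, 64, 128  # toy sizes (assumed)

class Selector(nn.Module):
    """Small, cheap model scoring how useful each review is for the summary."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.score = nn.Sequential(nn.Linear(EMB, HID), nn.ReLU(), nn.Linear(HID, 1))

    def forward(self, reviews):                        # reviews: (n_reviews, review_len)
        pooled = self.emb(reviews).mean(dim=1)         # (n_reviews, EMB)
        return torch.sigmoid(self.score(pooled)).squeeze(-1)  # per-review inclusion prob

class Summarizer(nn.Module):
    """Larger model scoring the reference summary given the selected reviews."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.ctx = nn.Linear(EMB, HID)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def log_likelihood(self, selected_reviews, summary):
        # Condition the decoder on a mean-pooled encoding of the selected reviews.
        pooled = self.emb(selected_reviews).mean(dim=(0, 1), keepdim=True)  # (1, 1, EMB)
        h0 = torch.tanh(self.ctx(pooled))                                   # (1, 1, HID)
        dec_in = self.emb(summary[:-1]).unsqueeze(0)                        # teacher forcing
        hidden, _ = self.rnn(dec_in, h0)
        logp = torch.log_softmax(self.out(hidden.squeeze(0)), dim=-1)
        return logp[torch.arange(len(summary) - 1), summary[1:]].sum()

selector, summarizer = Selector(), Summarizer()
opt = torch.optim.Adam(list(selector.parameters()) + list(summarizer.parameters()), lr=1e-3)

# Toy example: 5 reviews and one reference summary for a single product.
reviews = torch.randint(1, VOCAB, (5, 12))
summary = torch.randint(1, VOCAB, (15,))

for step in range(100):
    probs = selector(reviews)
    subset_dist = Bernoulli(probs)                     # review subset as a latent variable
    mask = subset_dist.sample()
    if mask.sum() == 0:                                # keep the subset non-empty
        mask[probs.argmax()] = 1.0
    reward = summarizer.log_likelihood(reviews[mask.bool()], summary)
    summarizer_loss = -reward                                                # ordinary NLL
    selector_loss = -(reward.detach() * subset_dist.log_prob(mask).sum())    # REINFORCE
    opt.zero_grad()
    (summarizer_loss + selector_loss).backward()
    opt.step()
```

In this sketch the detached reward term stands in for the policy-gradient signal mentioned in the abstract: the selector is pushed toward subsets under which the summarizer assigns higher likelihood to the human-written summary, while the summarizer itself is trained with an ordinary likelihood objective on the sampled subset.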
Related papers
- Incremental Extractive Opinion Summarization Using Cover Trees [81.59625423421355]
In online marketplaces, user reviews accumulate over time, and opinion summaries need to be updated periodically.
In this work, we study the task of extractive opinion summarization in an incremental setting.
We present an efficient algorithm for accurately computing the CentroidRank summaries in an incremental setting.
arXiv Detail & Related papers (2024-01-16T02:00:17Z)
- Large-Scale and Multi-Perspective Opinion Summarization with Diverse Review Subsets [23.515892409202344]
SUBSUMM is a supervised summarization framework for large-scale multi-perspective opinion summarization.
It generates pros, cons, and verdict summaries from hundreds of input reviews.
arXiv Detail & Related papers (2023-10-20T08:08:13Z)
- OpineSum: Entailment-based self-training for abstractive opinion summarization [6.584115526134759]
We present a novel self-training approach, OpineSum, for abstractive opinion summarization.
The summaries in this approach are built using a novel application of textual entailment.
OpineSum achieves state-of-the-art performance in both settings.
arXiv Detail & Related papers (2022-12-21T06:20:28Z)
- Comparing Methods for Extractive Summarization of Call Centre Dialogue [77.34726150561087]
We experimentally compare several extractive summarisation methods by using them to produce summaries of calls and evaluating these summaries objectively.
We found that TopicSum and Lead-N outperform the other summarisation methods, whilst BERTSum received comparatively lower scores in both subjective and objective evaluations.
arXiv Detail & Related papers (2022-09-06T13:16:02Z)
- Unsupervised Reference-Free Summary Quality Evaluation via Contrastive Learning [66.30909748400023]
We propose to evaluate summary quality without reference summaries via unsupervised contrastive learning.
Specifically, we design a new metric which covers both linguistic qualities and semantic informativeness based on BERT.
Experiments on Newsroom and CNN/Daily Mail demonstrate that our new evaluation method outperforms other metrics even without reference summaries.
arXiv Detail & Related papers (2020-10-05T05:04:14Z)
- Topic Detection and Summarization of User Reviews [6.779855791259679]
We propose an effective new summarization method that analyzes both reviews and their summaries.
A new dataset comprising product reviews and summaries for 1,028 products was collected from Amazon and CNET.
arXiv Detail & Related papers (2020-05-30T02:19:08Z)
- Few-Shot Learning for Opinion Summarization [117.70510762845338]
Opinion summarization is the automatic creation of text reflecting subjective information expressed in multiple documents.
In this work, we show that even a handful of summaries is sufficient to bootstrap generation of the summary text.
Our approach substantially outperforms previous extractive and abstractive methods in automatic and human evaluation.
arXiv Detail & Related papers (2020-04-30T15:37:38Z)
- Unsupervised Opinion Summarization with Noising and Denoising [85.49169453434554]
We create a synthetic dataset from a corpus of user reviews by sampling a review, pretending it is a summary, and generating noisy versions thereof.
At test time, the model accepts genuine reviews and generates a summary containing salient opinions, treating those that do not reach consensus as noise.
arXiv Detail & Related papers (2020-04-21T16:54:57Z)
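The noising-and-denoising entry above constructs synthetic training pairs by treating a sampled review as a pseudo-summary and corrupting it. The sketch below illustrates one plausible version of that construction; the specific noise operations (token dropout and splicing in segments of other reviews) and all rates are assumptions for illustration, not the paper's exact noise functions.

```python
# Minimal sketch of building a synthetic (noisy reviews -> pseudo-summary) pair.
# The noise operations and all parameters are illustrative assumptions, not the
# paper's exact noising procedure.
import random

def make_synthetic_pair(product_reviews, n_noisy=8, drop_p=0.2, seed=None):
    """Treat one sampled review as the pseudo-summary and derive noisy inputs from it."""
    rng = random.Random(seed)
    pseudo_summary = rng.choice(product_reviews)
    tokens = pseudo_summary.split()
    noisy_inputs = []
    for _ in range(n_noisy):
        kept = [t for t in tokens if rng.random() > drop_p]      # drop some tokens
        other = rng.choice(product_reviews).split()              # borrow from another review
        splice = other[: rng.randint(0, max(1, len(other) // 3))]
        noisy_inputs.append(" ".join(kept + splice))
    return noisy_inputs, pseudo_summary

reviews = [
    "great battery life and the screen is sharp",
    "battery lasts two days but the camera is mediocre",
    "loved the display although shipping was slow",
]
inputs, target = make_synthetic_pair(reviews, n_noisy=3, seed=0)
# A denoising summarizer would then be trained to reconstruct `target` from `inputs`.
```

Training a model to recover the pseudo-summary from such corrupted inputs gives it, at test time, the behavior described in the entry: genuine reviews go in, and opinions that do not reach consensus across them are treated as noise.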