Unsupervised Opinion Summarization with Content Planning
- URL: http://arxiv.org/abs/2012.07808v1
- Date: Mon, 14 Dec 2020 18:41:58 GMT
- Title: Unsupervised Opinion Summarization with Content Planning
- Authors: Reinald Kim Amplayo, Stefanos Angelidis, Mirella Lapata
- Abstract summary: We show that explicitly incorporating content planning in a summarization model yields output of higher quality.
We also create synthetic datasets which are more natural, resembling real world document-summary pairs.
Our approach outperforms competitive models in generating informative, coherent, and fluent summaries.
- Score: 58.5308638148329
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The recent success of deep learning techniques for abstractive summarization
is predicated on the availability of large-scale datasets. When summarizing
reviews (e.g., for products or movies), such training data is neither available
nor can be easily sourced, motivating the development of methods which rely on
synthetic datasets for supervised training. We show that explicitly
incorporating content planning in a summarization model not only yields output
of higher quality, but also allows the creation of synthetic datasets which are
more natural, resembling real world document-summary pairs. Our content plans
take the form of aspect and sentiment distributions which we induce from data
without access to expensive annotations. Synthetic datasets are created by
sampling pseudo-reviews from a Dirichlet distribution parametrized by our
content planner, while our model generates summaries based on input reviews and
induced content plans. Experimental results on three domains show that our
approach outperforms competitive models in generating informative, coherent,
and fluent summaries that capture opinion consensus.
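The abstract's core mechanism — sampling pseudo-reviews from a Dirichlet distribution parametrized by an induced content planner — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the aspect proportions, concentration value, and KL-based review selection are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical corpus-level aspect proportions induced by a content planner
# (e.g. food, service, price, ambience for restaurant reviews).
aspect_probs = np.array([0.4, 0.3, 0.2, 0.1])
concentration = 5.0  # higher values concentrate samples around aspect_probs

# Sample a target content plan from a Dirichlet parametrized by the planner.
plan = rng.dirichlet(concentration * aspect_probs)

# Toy pool of reviews, each represented only by its own aspect distribution.
review_aspects = rng.dirichlet(concentration * aspect_probs, size=100)

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions (with smoothing)."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Build a pseudo-review set: the k reviews whose aspect distributions
# are closest to the sampled plan.
k = 8
dists = np.array([kl(plan, r) for r in review_aspects])
pseudo_review_ids = np.argsort(dists)[:k]
```

In the paper's setup the sampled plan would condition summary generation as well; here the sketch only shows the sampling-and-selection step that produces a (pseudo-review set, target plan) training pair.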
Related papers
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created human-annotated dataset consisting of coherent summaries for five publicly available datasets and natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% Rouge-L) in terms of producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z)
- ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications [10.529898520273063]
ACLSum is a novel summarization dataset carefully crafted and evaluated by domain experts.
In contrast to previous datasets, ACLSum facilitates multi-aspect summarization of scientific papers.
arXiv Detail & Related papers (2024-03-08T13:32:01Z)
- Template-based Abstractive Microblog Opinion Summarisation [26.777997436856076]
We introduce the task of microblog opinion summarisation (MOS) and share a dataset of 3100 gold-standard opinion summaries.
The dataset contains summaries of tweets spanning a 2-year period and covers more topics than any other public Twitter summarisation dataset.
arXiv Detail & Related papers (2022-08-08T12:16:01Z)
- Improving the Faithfulness of Abstractive Summarization via Entity Coverage Control [27.214742188672464]
We propose a method to remedy entity-level hallucinations with Entity Coverage Control (ECC).
ECC computes entity coverage precision and prepends the corresponding control code to each training example.
We show that the proposed method leads to more faithful and salient abstractive summarization in supervised fine-tuning and zero-shot settings.
arXiv Detail & Related papers (2022-07-05T18:52:19Z)
- Aspect-Controllable Opinion Summarization [58.5308638148329]
We propose an approach that allows the generation of customized summaries based on aspect queries.
Using a review corpus, we create a synthetic training dataset of (review, summary) pairs enriched with aspect controllers.
We fine-tune a pretrained model using our synthetic dataset and generate aspect-specific summaries by modifying the aspect controllers.
arXiv Detail & Related papers (2021-09-07T16:09:17Z)
- Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation [101.26235068460551]
Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks.
Models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains.
We introduce a novel and generalizable method, called WikiTransfer, for fine-tuning pretrained models for summarization in an unsupervised, dataset-specific manner.
arXiv Detail & Related papers (2020-10-24T08:36:49Z)
- Few-Shot Learning for Opinion Summarization [117.70510762845338]
Opinion summarization is the automatic creation of text reflecting subjective information expressed in multiple documents.
In this work, we show that even a handful of summaries is sufficient to bootstrap generation of the summary text.
Our approach substantially outperforms previous extractive and abstractive methods in automatic and human evaluation.
arXiv Detail & Related papers (2020-04-30T15:37:38Z)
- Unsupervised Opinion Summarization with Noising and Denoising [85.49169453434554]
We create a synthetic dataset from a corpus of user reviews by sampling a review, pretending it is a summary, and generating noisy versions thereof.
At test time, the model accepts genuine reviews and generates a summary containing salient opinions, treating those that do not reach consensus as noise.
arXiv Detail & Related papers (2020-04-21T16:54:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.