Efficient Few-Shot Fine-Tuning for Opinion Summarization
- URL: http://arxiv.org/abs/2205.02170v1
- Date: Wed, 4 May 2022 16:38:37 GMT
- Title: Efficient Few-Shot Fine-Tuning for Opinion Summarization
- Authors: Arthur Bra\v{z}inskas, Ramesh Nallapati, Mohit Bansal, Markus Dreyer
- Abstract summary: Abstractive summarization models are typically pre-trained on large amounts of generic texts, then fine-tuned on tens or hundreds of thousands of annotated samples.
We show that a few-shot method based on adapters can easily store in-domain knowledge.
We show that this self-supervised adapter pre-training improves summary quality over standard fine-tuning by 2.0 and 1.3 ROUGE-L points on the Amazon and Yelp datasets.
- Score: 83.76460801568092
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Abstractive summarization models are typically pre-trained on large amounts
of generic texts, then fine-tuned on tens or hundreds of thousands of annotated
samples. However, in opinion summarization, large annotated datasets of reviews
paired with reference summaries are not available and would be expensive to
create. This calls for fine-tuning methods robust to overfitting on small
datasets. In addition, generically pre-trained models are often not accustomed
to the specifics of customer reviews and, after fine-tuning, yield summaries
with disfluencies and semantic mistakes. To address these problems, we utilize
an efficient few-shot method based on adapters which, as we show, can easily
store in-domain knowledge. Instead of fine-tuning the entire model, we add
adapters and pre-train them in a task-specific way on a large corpus of
unannotated customer reviews, using held-out reviews as pseudo summaries. Then,
fine-tune the adapters on the small available human-annotated dataset. We show
that this self-supervised adapter pre-training improves summary quality over
standard fine-tuning by 2.0 and 1.3 ROUGE-L points on the Amazon and Yelp
datasets, respectively. Finally, for summary personalization, we condition on
aspect keyword queries, automatically created from generic datasets. In the
same vein, we pre-train the adapters in a query-based manner on customer
reviews and then fine-tune them on annotated datasets. This results in
better-organized summary content reflected in improved coherence and fewer
redundancies.
Related papers
- On Context Utilization in Summarization with Large Language Models [83.84459732796302]
Large language models (LLMs) excel in abstractive summarization tasks, delivering fluent and pertinent summaries.
Recent advancements have extended their capabilities to handle long-input contexts, exceeding 100k tokens.
We conduct the first comprehensive study on context utilization and position bias in summarization.
arXiv Detail & Related papers (2023-10-16T16:45:12Z) - Improving Zero and Few-Shot Abstractive Summarization with Intermediate
Fine-tuning and Data Augmentation [101.26235068460551]
Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks.
Models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains.
We introduce a novel and generalizable method, called WikiTransfer, for fine-tuning pretrained models for summarization in an unsupervised, dataset-specific manner.
arXiv Detail & Related papers (2020-10-24T08:36:49Z) - Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG)
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models as well as verify the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z) - Align then Summarize: Automatic Alignment Methods for Summarization
Corpus Creation [8.029049649310211]
State-of-the-art on automatic text summarization mostly revolves around news articles.
Our work consists in segmenting and aligning transcriptions with respect to reports, to get a suitable dataset for neural summarization.
We report automatic alignment and summarization performances on a novel corpus of aligned public meetings.
arXiv Detail & Related papers (2020-07-15T17:03:34Z) - Few-Shot Learning for Opinion Summarization [117.70510762845338]
Opinion summarization is the automatic creation of text reflecting subjective information expressed in multiple documents.
In this work, we show that even a handful of summaries is sufficient to bootstrap generation of the summary text.
Our approach substantially outperforms previous extractive and abstractive methods in automatic and human evaluation.
arXiv Detail & Related papers (2020-04-30T15:37:38Z) - Unsupervised Opinion Summarization with Noising and Denoising [85.49169453434554]
We create a synthetic dataset from a corpus of user reviews by sampling a review, pretending it is a summary, and generating noisy versions thereof.
At test time, the model accepts genuine reviews and generates a summary containing salient opinions, treating those that do not reach consensus as noise.
arXiv Detail & Related papers (2020-04-21T16:54:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.