How well do you know your summarization datasets?
- URL: http://arxiv.org/abs/2106.11388v1
- Date: Mon, 21 Jun 2021 19:44:06 GMT
- Title: How well do you know your summarization datasets?
- Authors: Priyam Tejaswin, Dhruv Naik, Pengfei Liu
- Abstract summary: We analyze 600 samples from three popular summarization datasets.
We follow with a thorough analysis of 27 state-of-the-art summarization models and 5 popular metrics.
- Score: 11.992125069326772
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State-of-the-art summarization systems are trained and evaluated on massive
datasets scraped from the web. Despite their prevalence, we know very little
about the underlying characteristics (data noise, summarization complexity,
etc.) of these datasets, and how these affect system performance and the
reliability of automatic metrics like ROUGE. In this study, we manually analyze
600 samples from three popular summarization datasets. Our study is driven by a
six-class typology which captures different noise types (missing facts,
entities) and degrees of summarization difficulty (extractive, abstractive). We
follow with a thorough analysis of 27 state-of-the-art summarization models and
5 popular metrics, and report our key insights: (1) Datasets have distinct data
quality and complexity distributions, which can be traced back to their
collection process. (2) The performance of models and the reliability of metrics
are dependent on sample complexity. (3) Faithful summaries often receive low scores
because of the poor diversity of references. We release the code, annotated
data and model outputs.
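Insight (3) hinges on ROUGE's reliance on lexical overlap with the available references. As a minimal sketch (not the paper's released evaluation code; the example sentences and `rouge1_f1` helper are illustrative assumptions), ROUGE-1 F1 can be computed from clipped unigram counts, showing how a faithful but differently-worded summary scores low against a single reference, and how a more diverse reference set recovers credit:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # unigram matches, clipped per token
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# A faithful summary phrased differently from the single reference
# still scores poorly, since only "the", "this", "quarter" overlap.
reference = "the company reported record profits this quarter"
faithful = "the firm announced unprecedented earnings this quarter"
print(round(rouge1_f1(faithful, reference), 3))  # 0.429

# With a second, differently-worded reference, taking the max over
# references restores credit for the same faithful summary.
references = [reference, "the firm announced record earnings this quarter"]
print(round(max(rouge1_f1(faithful, r) for r in references), 3))  # 0.857
```

Production evaluations typically use a library implementation with stemming and ROUGE-2/ROUGE-L variants, but the overlap mechanics, and hence the sensitivity to reference diversity, are the same.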
Related papers
- Common-Sense Bias Modeling for Classification Tasks [15.683471433842492]
We propose a novel framework to extract comprehensive biases in image datasets based on textual descriptions.
Our method uncovers novel model biases in multiple image benchmark datasets.
The discovered bias can be mitigated by simple data re-weighting to de-correlate the features.
arXiv Detail & Related papers (2024-01-24T03:56:07Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z)
- Quantified Sleep: Machine learning techniques for observational n-of-1 studies [0.0]
This paper applies statistical learning techniques to an observational Quantified-Self study to build a descriptive model of sleep quality.
Sleep quality is one of the most difficult modelling targets in QS research, due to high noise and a large number of weakly-contributing factors.
arXiv Detail & Related papers (2021-05-14T13:13:17Z)
- Unsupervised Opinion Summarization with Content Planning [58.5308638148329]
We show that explicitly incorporating content planning in a summarization model yields output of higher quality.
We also create synthetic datasets which are more natural, resembling real world document-summary pairs.
Our approach outperforms competitive models in generating informative, coherent, and fluent summaries.
arXiv Detail & Related papers (2020-12-14T18:41:58Z)
- Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation [101.26235068460551]
Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks.
Models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains.
We introduce a novel and generalizable method, called WikiTransfer, for fine-tuning pretrained models for summarization in an unsupervised, dataset-specific manner.
arXiv Detail & Related papers (2020-10-24T08:36:49Z)
- CDEvalSumm: An Empirical Study of Cross-Dataset Evaluation for Neural Summarization Systems [121.78477833009671]
We investigate the performance of different summarization models under a cross-dataset setting.
A comprehensive study of 11 representative summarization systems on 5 datasets from different domains reveals the effect of model architectures and generation ways.
arXiv Detail & Related papers (2020-10-11T02:19:15Z)
- Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward [42.925345819778656]
We present ASGARD, a novel framework for Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD.
We propose the use of dual encoders (a sequential document encoder and a graph-structured encoder) to maintain the global context and local characteristics of entities.
Results show that our models produce significantly higher ROUGE scores than a variant without knowledge graph as input on both New York Times and CNN/Daily Mail datasets.
arXiv Detail & Related papers (2020-05-03T18:23:06Z)
- Unsupervised Opinion Summarization with Noising and Denoising [85.49169453434554]
We create a synthetic dataset from a corpus of user reviews by sampling a review, pretending it is a summary, and generating noisy versions thereof.
At test time, the model accepts genuine reviews and generates a summary containing salient opinions, treating those that do not reach consensus as noise.
arXiv Detail & Related papers (2020-04-21T16:54:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information in it) and is not responsible for any consequences of its use.