Corpora Evaluation and System Bias Detection in Multi-document
Summarization
- URL: http://arxiv.org/abs/2010.01786v1
- Date: Mon, 5 Oct 2020 05:25:43 GMT
- Title: Corpora Evaluation and System Bias Detection in Multi-document
Summarization
- Authors: Alvin Dey, Tanya Chowdhury, Yash Kumar Atri, Tanmoy Chakraborty
- Abstract summary: Multi-document summarization (MDS) is the task of reflecting key points from any set of documents into a concise text paragraph.
Owing to the lack of a standard definition of the task, we encounter a plethora of datasets with varying levels of overlap and conflict between participating documents.
New systems report results on a set of chosen datasets, which might not correlate with their performance on the other datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-document summarization (MDS) is the task of reflecting key points from
any set of documents into a concise text paragraph. In the past, it has been
used to aggregate news, tweets, product reviews, etc. from various sources.
Owing to the lack of a standard definition of the task, we encounter a plethora of
datasets with varying levels of overlap and conflict between participating
documents. There is also no standard regarding what constitutes summary
information in MDS. Adding to the challenge is the fact that new systems report
results on a set of chosen datasets, which might not correlate with their
performance on the other datasets. In this paper, we study this heterogeneous
task with the help of a few widely used MDS corpora and a suite of
state-of-the-art models. We attempt to quantify the quality of a
summarization corpus and prescribe a list of points to consider when proposing
a new MDS corpus. Next, we analyze why no single MDS system achieves
superior performance across all corpora. We then observe
the extent to which system metrics are influenced, and bias is propagated due
to corpus properties. The scripts to reproduce the experiments in this work are
available at https://github.com/LCS2-IIITD/summarization_bias.git.
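The abstract's goal of quantifying corpus quality can be illustrated with a simple proxy: the average pairwise n-gram overlap among the documents of a topic cluster, which reflects the "overlap and conflict between participating documents" mentioned above. This is a minimal sketch under assumed definitions, not the metric used in the paper; the function names and the toy documents are hypothetical.

```python
from itertools import combinations

def ngrams(text, n=2):
    """Return the set of word n-grams in a text (lowercased)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def pairwise_overlap(documents, n=2):
    """Average Jaccard overlap of n-gram sets over all document pairs.

    Higher values suggest more redundancy between the documents of a
    cluster; lower values suggest more complementary (or conflicting)
    content.
    """
    scores = []
    for a, b in combinations(documents, 2):
        ga, gb = ngrams(a, n), ngrams(b, n)
        if ga or gb:
            scores.append(len(ga & gb) / len(ga | gb))
    return sum(scores) / len(scores) if scores else 0.0

docs = [
    "the earthquake struck the coastal city early on monday",
    "the earthquake struck the coastal city causing damage",
    "officials promised federal aid for reconstruction",
]
print(pairwise_overlap(docs))
```

A corpus where this proxy is near 1.0 consists of near-duplicate documents, so single-document summarizers would do well on it; a value near 0.0 suggests genuinely dispersed information.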
Related papers
- The Power of Summary-Source Alignments [62.76959473193149]
Multi-document summarization (MDS) is a challenging task, often decomposed into subtasks of salience and redundancy detection.
The alignment of corresponding sentences between a reference summary and its source documents has been leveraged to generate training data.
This paper proposes extending the summary-source alignment framework by applying it at the more fine-grained proposition span level.
arXiv Detail & Related papers (2024-06-02T19:35:19Z) - Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and a Case Study on Summarizing Diverse Information from News Articles [136.84278943588652]
We propose a new task of summarizing diverse information encountered in multiple news articles encompassing the same event.
To facilitate this task, we outlined a data collection schema for identifying diverse information and curated a dataset named DiverseSumm.
The dataset includes 245 news stories, with each story comprising 10 news articles and paired with a human-validated reference.
arXiv Detail & Related papers (2023-09-17T20:28:17Z) - How "Multi" is Multi-Document Summarization? [15.574673241564932]
It is expected that both reference summaries in MDS datasets, as well as system summaries, would indeed be based on dispersed information.
We propose an automated measure for evaluating the degree to which a summary is "dispersed".
Our results show that certain MDS datasets barely require combining information from multiple documents, as a single document often covers the full summary content.
arXiv Detail & Related papers (2022-10-23T10:20:09Z) - A Closer Look at Debiased Temporal Sentence Grounding in Videos:
Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric, "dR@n,IoU@m", that discounts the basic recall scores to alleviate the inflated evaluation caused by biased datasets.
arXiv Detail & Related papers (2022-03-10T08:58:18Z) - Unsupervised Summarization with Customized Granularities [76.26899748972423]
We propose the first unsupervised multi-granularity summarization framework, GranuSum.
By inputting different numbers of events, GranuSum is capable of producing multi-granular summaries in an unsupervised manner.
arXiv Detail & Related papers (2022-01-29T05:56:35Z) - HowSumm: A Multi-Document Summarization Dataset Derived from WikiHow
Articles [8.53502615629675]
We present HowSumm, a novel large-scale dataset for the task of query-focused multi-document summarization (qMDS).
This use-case is different from the use-cases covered in existing multi-document summarization (MDS) datasets and is applicable to educational and industrial scenarios.
We describe the creation of the dataset and discuss the unique features that distinguish it from other summarization corpora.
arXiv Detail & Related papers (2021-10-07T04:44:32Z) - WSL-DS: Weakly Supervised Learning with Distant Supervision for Query
Focused Multi-Document Abstractive Summarization [16.048329028104643]
In the Query Focused Multi-Document Summarization (QF-MDS) task, a set of documents and a query are given where the goal is to generate a summary from these documents.
One major challenge for this task is the lack of availability of labeled training datasets.
We propose a novel weakly supervised learning approach via utilizing distant supervision.
arXiv Detail & Related papers (2020-11-03T02:02:55Z) - SupMMD: A Sentence Importance Model for Extractive Summarization using
Maximum Mean Discrepancy [92.5683788430012]
SupMMD is a novel technique for generic and update summarization based on the maximum mean discrepancy (MMD) from kernel two-sample testing.
We show the efficacy of SupMMD in both generic and update summarization tasks by meeting or exceeding the current state-of-the-art on the DUC-2004 and TAC-2009 datasets.
arXiv Detail & Related papers (2020-10-06T09:26:55Z) - Massive Multi-Document Summarization of Product Reviews with Weak
Supervision [11.462916848094403]
Product review summarization is a type of Multi-Document Summarization (MDS) task.
We show that summarizing small samples of the reviews can result in loss of important information.
We propose a schema for summarizing a massive set of reviews on top of a standard summarization algorithm.
arXiv Detail & Related papers (2020-07-22T11:22:57Z) - Overview of the TREC 2019 Fair Ranking Track [65.15263872493799]
The goal of the TREC Fair Ranking track was to develop a benchmark for evaluating retrieval systems in terms of fairness to different content providers.
This paper presents an overview of the track, including the task definition, descriptions of the data and the annotation process.
arXiv Detail & Related papers (2020-03-25T21:34:58Z)
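The SupMMD entry above builds on the maximum mean discrepancy (MMD) from kernel two-sample testing: a distance between two samples that is near zero when they come from the same distribution. A minimal sketch of the biased squared-MMD estimator on scalar inputs follows; the RBF kernel choice, the gamma value, and all names are illustrative assumptions, not the paper's implementation.

```python
import math

def rbf(x, y, gamma=1.0):
    """RBF kernel between two scalars: exp(-gamma * (x - y)^2)."""
    return math.exp(-gamma * (x - y) ** 2)

def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between two samples.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    """
    kxx = sum(rbf(a, b, gamma) for a in xs for b in xs) / (len(xs) ** 2)
    kyy = sum(rbf(a, b, gamma) for a in ys for b in ys) / (len(ys) ** 2)
    kxy = sum(rbf(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy

# Identical samples give MMD^2 ~ 0; well-separated samples give a large value.
same = mmd2([0.0, 0.1, 0.2], [0.0, 0.1, 0.2])
far = mmd2([0.0, 0.1, 0.2], [5.0, 5.1, 5.2])
print(same, far)
```

In a summarization setting, the scalars would be replaced by sentence embeddings and a vector kernel; a candidate summary whose sentences minimize the MMD to the source documents is, in this sense, representative of them.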
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.