SupMMD: A Sentence Importance Model for Extractive Summarization using
Maximum Mean Discrepancy
- URL: http://arxiv.org/abs/2010.02568v1
- Date: Tue, 6 Oct 2020 09:26:55 GMT
- Title: SupMMD: A Sentence Importance Model for Extractive Summarization using
Maximum Mean Discrepancy
- Authors: Umanga Bista, Alexander Patrick Mathews, Aditya Krishna Menon, Lexing
Xie
- Abstract summary: SupMMD is a novel technique for generic and update summarization based on the maximum mean discrepancy from kernel two-sample testing.
We show the efficacy of SupMMD in both generic and update summarization tasks by meeting or exceeding the current state-of-the-art on the DUC-2004 and TAC-2009 datasets.
- Score: 92.5683788430012
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most work on multi-document summarization has focused on generic
summarization of information present in each individual document set. However,
the under-explored setting of update summarization, where the goal is to
identify the new information present in each set, is of equal practical
interest (e.g., presenting readers with updates on an evolving news topic). In
this work, we present SupMMD, a novel technique for generic and update
summarization based on the maximum mean discrepancy from kernel two-sample
testing. SupMMD combines both supervised learning for salience and unsupervised
learning for coverage and diversity. Further, we adapt multiple kernel learning
to make use of similarity across multiple information sources (e.g., text
features and knowledge based concepts). We show the efficacy of SupMMD in both
generic and update summarization tasks by meeting or exceeding the current
state-of-the-art on the DUC-2004 and TAC-2009 datasets.
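The core quantity behind SupMMD is the maximum mean discrepancy (MMD) from kernel two-sample testing: a kernel-based distance between two sets of samples. As a rough illustration only, the sketch below computes an unbiased squared-MMD estimate between document-sentence embeddings and a candidate summary; the NumPy implementation, the RBF kernel, and the fixed-weight `combined_kernel` helper (gesturing at the multiple-kernel-learning idea of mixing kernels from several information sources) are assumptions for the example, not the authors' code.

```python
# Minimal sketch (not the authors' implementation): an unbiased estimate of
# squared MMD between two embedding sets, plus a convex kernel combination
# in the spirit of multiple kernel learning. Kernel choices and inputs are
# illustrative assumptions.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Pairwise RBF kernel: k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

def combined_kernel(A, B, gammas=(0.5, 1.0, 2.0), weights=(1/3, 1/3, 1/3)):
    """Convex combination of base kernels. In multiple kernel learning the
    weights would be learned; here they are fixed for illustration."""
    return sum(w * rbf_kernel(A, B, g) for w, g in zip(weights, gammas))

def mmd2_unbiased(X, Y, kernel=rbf_kernel):
    """Unbiased squared-MMD estimate between samples X (m, d) and Y (n, d)."""
    m, n = len(X), len(Y)
    Kxx, Kyy, Kxy = kernel(X, X), kernel(Y, Y), kernel(X, Y)
    # Exclude diagonal (self-similarity) terms from the within-sample sums.
    xx = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    yy = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return xx + yy - 2.0 * Kxy.mean()

# Toy usage with random "sentence embeddings": a summary that covers the
# document set well should yield a small MMD to the full sentence set.
rng = np.random.default_rng(0)
docs = rng.normal(size=(50, 8))                      # hypothetical doc sentences
summ = docs[rng.choice(50, size=5, replace=False)]   # hypothetical extract
print(mmd2_unbiased(docs, summ, kernel=combined_kernel))
```

In this reading, a low MMD between summary and documents captures coverage; SupMMD additionally learns sentence salience in a supervised fashion, which a sketch like this does not attempt.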
Related papers
- Write Summary Step-by-Step: A Pilot Study of Stepwise Summarization [48.57273563299046]
We propose the task of Stepwise Summarization, which aims to generate a new appended summary each time a new document is proposed.
The appended summary should not only summarize the newly added content but also be coherent with the previous summary.
We show that the proposed model, SSG, achieves state-of-the-art performance in terms of both automatic metrics and human evaluations.
arXiv Detail & Related papers (2024-06-08T05:37:26Z)
- Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and a Case Study on Summarizing Diverse Information from News Articles [136.84278943588652]
We propose a new task of summarizing diverse information encountered in multiple news articles encompassing the same event.
To facilitate this task, we outlined a data collection schema for identifying diverse information and curated a dataset named DiverseSumm.
The dataset includes 245 news stories, with each story comprising 10 news articles and paired with a human-validated reference.
arXiv Detail & Related papers (2023-09-17T20:28:17Z)
- Align and Attend: Multimodal Summarization with Dual Contrastive Losses [57.83012574678091]
The goal of multimodal summarization is to extract the most important information from different modalities to form output summaries.
Existing methods fail to leverage the temporal correspondence between different modalities and ignore the intrinsic correlation between different samples.
We introduce Align and Attend Multimodal Summarization (A2Summ), a unified multimodal transformer-based model which can effectively align and attend the multimodal input.
arXiv Detail & Related papers (2023-03-13T17:01:42Z)
- UniSumm and SummZoo: Unified Model and Diverse Benchmark for Few-Shot Summarization [54.59104881168188]
UniSumm is a unified few-shot summarization model pre-trained with multiple summarization tasks.
SummZoo is a new benchmark to better evaluate few-shot summarizers.
arXiv Detail & Related papers (2022-11-17T18:54:47Z)
- How "Multi" is Multi-Document Summarization? [15.574673241564932]
It is expected that both reference summaries in MDS datasets and system summaries would indeed be based on dispersed information.
We propose an automated measure for evaluating the degree to which a summary is "disperse".
Our results show that certain MDS datasets barely require combining information from multiple documents, where a single document often covers the full summary content.
arXiv Detail & Related papers (2022-10-23T10:20:09Z)
- Guided Exploration of Data Summaries [24.16170440895994]
A useful summary contains k individually uniform sets that are collectively diverse enough to be representative.
Finding such a summary is a difficult task when the data is highly diverse and large.
We examine the applicability of Exploratory Data Analysis (EDA) to data summarization and formalize Eda4Sum.
arXiv Detail & Related papers (2022-05-27T13:06:27Z)
- Unsupervised Summarization with Customized Granularities [76.26899748972423]
We propose the first unsupervised multi-granularity summarization framework, GranuSum.
By inputting different numbers of events, GranuSum is capable of producing multi-granular summaries in an unsupervised manner.
arXiv Detail & Related papers (2022-01-29T05:56:35Z)
- HowSumm: A Multi-Document Summarization Dataset Derived from WikiHow Articles [8.53502615629675]
We present HowSumm, a novel large-scale dataset for the task of query-focused multi-document summarization (qMDS).
This use-case is different from the use-cases covered in existing multi-document summarization (MDS) datasets and is applicable to educational and industrial scenarios.
We describe the creation of the dataset and discuss the unique features that distinguish it from other summarization corpora.
arXiv Detail & Related papers (2021-10-07T04:44:32Z)