OpenAsp: A Benchmark for Multi-document Open Aspect-based Summarization
- URL: http://arxiv.org/abs/2312.04440v1
- Date: Thu, 7 Dec 2023 17:06:20 GMT
- Title: OpenAsp: A Benchmark for Multi-document Open Aspect-based Summarization
- Authors: Shmuel Amar, Liat Schiff, Ori Ernst, Asi Shefer, Ori Shapira and Ido
Dagan
- Abstract summary: We introduce OpenAsp, a benchmark for aspect-based summarization.
We show that the realistic open-aspect setting realized in OpenAsp poses a challenge for current state-of-the-art summarization models.
- Score: 19.079053035229695
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The performance of automatic summarization models has improved dramatically
in recent years. Yet, there is still a gap in meeting specific information
needs of users in real-world scenarios, particularly when a targeted summary is
sought, such as in the useful aspect-based summarization setting targeted in
this paper. Previous datasets and studies for this setting have predominantly
concentrated on a limited set of pre-defined aspects, focused solely on single
document inputs, or relied on synthetic data. To advance research on more
realistic scenarios, we introduce OpenAsp, a benchmark for multi-document
\textit{open} aspect-based summarization. This benchmark is created using a
novel and cost-effective annotation protocol, by which an open aspect dataset
is derived from existing generic multi-document summarization datasets. We
analyze the properties of OpenAsp showcasing its high-quality content. Further,
we show that the realistic open-aspect setting realized in OpenAsp poses a
challenge for current state-of-the-art summarization models, as well as for
large language models.
Related papers
- Beyond Relevant Documents: A Knowledge-Intensive Approach for Query-Focused Summarization using Large Language Models [27.90653125902507]
We propose a knowledge-intensive approach that reframes query-focused summarization as a knowledge-intensive task setup.
The retrieval module efficiently retrieves potentially relevant documents from a large-scale knowledge corpus.
The summarization controller seamlessly integrates a powerful large language model (LLM)-based summarizer with a carefully tailored prompt.
arXiv Detail & Related papers (2024-08-19T18:54:20Z) - Write Summary Step-by-Step: A Pilot Study of Stepwise Summarization [48.57273563299046]
We propose the task of Stepwise Summarization, which aims to generate a new appended summary each time a new document is proposed.
The appended summary should not only summarize the newly added content but also be coherent with the previous summary.
We show that SSG achieves state-of-the-art performance in terms of both automatic metrics and human evaluations.
arXiv Detail & Related papers (2024-06-08T05:37:26Z) - LMGQS: A Large-scale Dataset for Query-focused Summarization [77.6179359525065]
We convert four generic summarization benchmarks into a new QFS benchmark dataset, LMGQS.
We establish baselines with state-of-the-art summarization models.
We achieve state-of-the-art zero-shot and supervised performance on multiple existing QFS benchmarks.
arXiv Detail & Related papers (2023-05-22T14:53:45Z) - Modeling Entities as Semantic Points for Visual Information Extraction
in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z) - ACM -- Attribute Conditioning for Abstractive Multi Document
Summarization [0.0]
We propose a model that incorporates attribute conditioning modules in order to decouple conflicting information by conditioning for a certain attribute in the output summary.
This approach shows strong gains in ROUGE score over baseline multi document summarization approaches.
arXiv Detail & Related papers (2022-05-09T00:00:14Z) - Unsupervised Summarization with Customized Granularities [76.26899748972423]
We propose the first unsupervised multi-granularity summarization framework, GranuSum.
By inputting different numbers of events, GranuSum is capable of producing multi-granular summaries in an unsupervised manner.
arXiv Detail & Related papers (2022-01-29T05:56:35Z) - Aspect-Oriented Summarization through Query-Focused Extraction [23.62412515574206]
Real users' needs often fall more closely into aspects, broad topics in a dataset the user is interested in rather than specific queries.
We benchmark extractive query-focused training schemes, and propose a contrastive augmentation approach to train the model.
We evaluate on two aspect-oriented datasets and find this approach yields focused summaries, better than those from a generic summarization system.
arXiv Detail & Related papers (2021-10-15T18:06:21Z) - Text Summarization with Latent Queries [60.468323530248945]
We introduce LaQSum, the first unified text summarization system that learns Latent Queries from documents for abstractive summarization with any existing query forms.
Under a deep generative framework, our system jointly optimize a latent query model and a conditional language model, allowing users to plug-and-play queries of any type at test time.
Our system robustly outperforms strong comparison systems across summarization benchmarks with different query types, document settings, and target domains.
arXiv Detail & Related papers (2021-05-31T21:14:58Z) - WikiAsp: A Dataset for Multi-domain Aspect-based Summarization [69.13865812754058]
We propose WikiAsp, a large-scale dataset for multi-domain aspect-based summarization.
Specifically, we build the dataset using Wikipedia articles from 20 different domains, using the section titles and boundaries of each article as a proxy for aspect annotation.
Results highlight key challenges that existing summarization models face in this setting, such as proper pronoun handling of quoted sources and consistent explanation of time-sensitive events.
arXiv Detail & Related papers (2020-11-16T10:02:52Z) - WSL-DS: Weakly Supervised Learning with Distant Supervision for Query
Focused Multi-Document Abstractive Summarization [16.048329028104643]
In the Query Focused Multi-Document Summarization (QF-MDS) task, a set of documents and a query are given where the goal is to generate a summary from these documents.
One major challenge for this task is the lack of availability of labeled training datasets.
We propose a novel weakly supervised learning approach via utilizing distant supervision.
arXiv Detail & Related papers (2020-11-03T02:02:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.