Read Top News First: A Document Reordering Approach for Multi-Document
News Summarization
- URL: http://arxiv.org/abs/2203.10254v1
- Date: Sat, 19 Mar 2022 06:01:11 GMT
- Title: Read Top News First: A Document Reordering Approach for Multi-Document
News Summarization
- Authors: Chao Zhao, Tenghao Huang, Somnath Basu Roy Chowdhury, Muthu Kumar
Chandrasekaran, Kathleen McKeown, Snigdha Chaturvedi
- Abstract summary: We propose a simple approach to reorder the documents according to their relative importance before concatenating and summarizing them.
The reordering makes the salient content easier for the summarization model to learn.
- Score: 27.30854257540805
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A common method for extractive multi-document news summarization is to
re-formulate it as a single-document summarization problem by concatenating all
documents as a single meta-document. However, this method neglects the relative
importance of documents. We propose a simple approach to reorder the documents
according to their relative importance before concatenating and summarizing
them. The reordering makes the salient content easier for the summarization
model to learn. Experiments show that our approach outperforms previous
state-of-the-art methods with more complex architectures.
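As a rough illustration of the idea in the abstract, the sketch below scores each document in a cluster, sorts the documents from most to least important, concatenates them into a meta-document, and summarizes the result with an off-the-shelf model. The TF-IDF centroid scorer and the facebook/bart-large-cnn checkpoint are illustrative assumptions, not the importance estimator or summarizer used in the paper.

```python
# Minimal sketch of importance-based document reordering before summarization.
# The centroid-similarity scorer and the BART checkpoint named below are
# illustrative stand-ins, not the paper's actual scoring method or model.
from typing import List

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import pipeline

def reorder_by_importance(docs: List[str]) -> List[str]:
    """Order documents from most to least important (heuristic stand-in)."""
    vectors = TfidfVectorizer().fit_transform(docs)
    centroid = np.asarray(vectors.mean(axis=0))              # topic centroid
    scores = cosine_similarity(vectors, centroid).ravel()
    ranked = sorted(zip(scores, range(len(docs))), reverse=True)
    return [docs[i] for _, i in ranked]

def summarize_cluster(docs: List[str]) -> str:
    """Concatenate the reordered documents and summarize the meta-document."""
    meta_document = " ".join(reorder_by_importance(docs))
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    return summarizer(meta_document, truncation=True)[0]["summary_text"]
```

One plausible benefit, given the truncation used here, is that a meta-document longer than the summarizer's input limit loses its trailing content first, so placing the most important documents at the front keeps their salient content within the model's view.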
Related papers
- Unified Multi-Modal Interleaved Document Representation for Information Retrieval [57.65409208879344]
We produce more comprehensive and nuanced document representations by holistically embedding documents interleaved with different modalities.
Specifically, we achieve this by leveraging the capability of recent vision-language models that enable the processing and integration of text, images, and tables into a unified format and representation.
arXiv Detail & Related papers (2024-10-03T17:49:09Z)
- Shaping Political Discourse using multi-source News Summarization [0.46040036610482665]
We have developed a machine learning model that generates a concise summary of a topic from multiple news documents.
The model is designed to be unbiased by sampling its input equally from all the different aspects of the topic.
arXiv Detail & Related papers (2023-12-18T21:03:46Z)
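The entry above mentions balancing the model's input across the different aspects of a topic. Below is a minimal sketch of one way to do that, treating aspects as TF-IDF K-means clusters and drawing an equal number of articles from each cluster; both choices are illustrative assumptions rather than the procedure used in that paper.

```python
# Rough sketch of balanced input sampling across aspects of a topic.
# Discovering aspects with K-means over TF-IDF vectors and using a fixed
# per-aspect quota are assumptions, not the paper's actual method.
import random
from typing import List

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def sample_balanced_input(articles: List[str], n_aspects: int = 4, per_aspect: int = 2) -> List[str]:
    """Pick an equal number of articles from each discovered aspect cluster."""
    vectors = TfidfVectorizer(stop_words="english").fit_transform(articles)
    labels = KMeans(n_clusters=n_aspects, n_init=10, random_state=0).fit_predict(vectors)
    sampled = []
    for aspect in range(n_aspects):
        members = [a for a, label in zip(articles, labels) if label == aspect]
        sampled.extend(random.sample(members, min(per_aspect, len(members))))
    return sampled
```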
- Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering [49.85790367128085]
We pre-train a generic multi-document model with a novel cross-document question answering pre-training objective.
This novel multi-document QA formulation directs the model to better recover cross-text informational relations.
Unlike prior multi-document models that focus on either classification or summarization tasks, our pre-training objective formulation enables the model to perform tasks that involve both short text generation and long text generation.
arXiv Detail & Related papers (2023-05-24T17:48:40Z)
- Learning Diverse Document Representations with Deep Query Interactions for Dense Retrieval [79.37614949970013]
We propose a new dense retrieval model which learns diverse document representations with deep query interactions.
Our model encodes each document with a set of generated pseudo-queries to get query-informed, multi-view document representations.
arXiv Detail & Related papers (2022-08-08T16:00:55Z)
- ACM -- Attribute Conditioning for Abstractive Multi Document Summarization [0.0]
We propose a model that incorporates attribute conditioning modules to decouple conflicting information by conditioning on a certain attribute in the output summary.
This approach shows strong gains in ROUGE score over baseline multi-document summarization approaches.
arXiv Detail & Related papers (2022-05-09T00:00:14Z)
- Large-Scale Multi-Document Summarization with Information Extraction and Compression [31.601707033466766]
We develop an abstractive summarization framework for multiple heterogeneous documents that does not rely on labeled data.
Our framework processes documents that tell different stories, rather than documents on the same topic.
Our experiments demonstrate that our framework outperforms current state-of-the-art methods in this more generic setting.
arXiv Detail & Related papers (2022-05-01T19:49:15Z)
- Multi-View Document Representation Learning for Open-Domain Dense Retrieval [87.11836738011007]
This paper proposes a multi-view document representation learning framework.
It aims to produce multi-view embeddings to represent documents and enforce them to align with different queries.
Experiments show our method outperforms recent works and achieves state-of-the-art results.
arXiv Detail & Related papers (2022-03-16T03:36:38Z)
- Value Retrieval with Arbitrary Queries for Form-like Documents [50.5532781148902]
We propose value retrieval with arbitrary queries for form-like documents.
Our method predicts the target value for an arbitrary query based on an understanding of the layout and semantics of a form.
We propose a simple document language modeling (simpleDLM) strategy to improve document understanding through large-scale model pre-training.
arXiv Detail & Related papers (2021-12-15T01:12:02Z)
- Modeling Endorsement for Multi-Document Abstractive Summarization [10.166639983949887]
A crucial difference between single- and multi-document summarization is how salient content manifests itself in the document(s).
In this paper, we model the cross-document endorsement effect and its use in multi-document summarization.
Our method generates a synopsis from each document, which serves as an endorser to identify salient content from other documents.
arXiv Detail & Related papers (2021-10-15T03:55:42Z)
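The endorsement entry above describes generating a synopsis from each document and letting it identify salient content in the other documents. The sketch below counts, for every sentence, how many other documents' synopses endorse it; using lead sentences as the synopsis and a fixed similarity threshold are illustrative assumptions, not that paper's synopsis generator or scoring scheme.

```python
# Rough sketch of cross-document endorsement: a synopsis of each document
# "endorses" similar sentences in the other documents, and heavily endorsed
# sentences are treated as salient.
from collections import defaultdict
from typing import Dict, List, Tuple

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def endorsement_scores(docs: List[List[str]], threshold: float = 0.3) -> Dict[Tuple[int, int], int]:
    """Count how many other documents' synopses endorse each (doc, sentence) pair."""
    synopses = [" ".join(sentences[:2]) for sentences in docs]      # stand-in synopsis: lead sentences
    flat = [(d, i, s) for d, sentences in enumerate(docs) for i, s in enumerate(sentences)]
    vectorizer = TfidfVectorizer().fit([s for _, _, s in flat] + synopses)
    sims = cosine_similarity(vectorizer.transform([s for _, _, s in flat]),
                             vectorizer.transform(synopses))        # sentences x synopses
    scores: Dict[Tuple[int, int], int] = defaultdict(int)
    for row, (d, i, _) in enumerate(flat):
        for other_doc in range(len(docs)):
            if other_doc != d and sims[row, other_doc] >= threshold:  # endorsed by another document
                scores[(d, i)] += 1
    return scores
```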
- Leveraging Graph to Improve Abstractive Multi-Document Summarization [50.62418656177642]
We develop a neural abstractive multi-document summarization (MDS) model which can leverage well-known graph representations of documents.
Our model utilizes graphs to encode documents in order to capture cross-document relations, which is crucial to summarizing long documents.
Our model can also take advantage of graphs to guide the summary generation process, which is beneficial for generating coherent and concise summaries.
arXiv Detail & Related papers (2020-05-20T13:39:47Z)
- A Divide-and-Conquer Approach to the Summarization of Long Documents [4.863209463405628]
We present a novel divide-and-conquer method for the neural summarization of long documents.
Our method exploits the discourse structure of the document and uses sentence similarity to split the problem into smaller summarization problems.
We demonstrate that this approach paired with different summarization models, including sequence-to-sequence RNNs and Transformers, can lead to improved summarization performance.
arXiv Detail & Related papers (2020-04-13T20:38:49Z)
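The divide-and-conquer entry above mentions using sentence similarity to split a long document into smaller summarization problems. The sketch below shows one such splitting step, cutting the sentence sequence greedily where adjacent sentences are least similar; the TF-IDF representation, the 0.1 threshold, and the chunk-size cap are assumptions rather than the authors' exact procedure. Each resulting chunk would then be summarized independently and the partial summaries combined.

```python
# Rough sketch: split a long document into contiguous chunks at points where
# adjacent sentences have low similarity, so each chunk can be summarized
# separately. The greedy TF-IDF boundary rule below is an assumption.
from typing import List

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def split_by_sentence_similarity(sentences: List[str], max_chunk: int = 20) -> List[List[str]]:
    """Greedily cut the sentence sequence at low-similarity adjacent pairs."""
    vectors = TfidfVectorizer().fit_transform(sentences)
    chunks, current = [], [sentences[0]]
    for i in range(1, len(sentences)):
        similarity = cosine_similarity(vectors[i - 1], vectors[i])[0, 0]
        if similarity < 0.1 or len(current) >= max_chunk:   # low cohesion -> start a new chunk
            chunks.append(current)
            current = []
        current.append(sentences[i])
    chunks.append(current)
    return chunks
```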