A Hierarchical Encoding-Decoding Scheme for Abstractive Multi-document
Summarization
- URL: http://arxiv.org/abs/2305.08503v5
- Date: Wed, 1 Nov 2023 09:18:34 GMT
- Title: A Hierarchical Encoding-Decoding Scheme for Abstractive Multi-document
Summarization
- Authors: Chenhui Shen, Liying Cheng, Xuan-Phi Nguyen, Yang You, Lidong Bing
- Abstract summary: Pre-trained language models (PLMs) have achieved outstanding results in abstractive single-document summarization (SDS).
We propose a new method to better utilize a PLM to facilitate multi-document interactions for the multi-document summarization (MDS) task.
Our method outperforms its corresponding PLM backbone by up to 3 ROUGE-L points and is favored by humans.
- Score: 66.08074487429477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models (PLMs) have achieved outstanding results in
abstractive single-document summarization (SDS). However, such benefits may not
fully extend to multi-document summarization (MDS), where the handling of
cross-document information is more complex. Previous works either design new
MDS architectures or apply PLMs bluntly with concatenated source documents as a
reformulated SDS task. While the former does not utilize previous pre-training
efforts and may not generalize well across different domains, the latter may
not sufficiently attend to the intricate cross-document relationships unique to
MDS tasks. Instead, we enforce hierarchy on both the encoder and decoder to
better utilize a PLM to facilitate multi-document interactions for the MDS
task. Across 10 MDS benchmarks from various domains, our method outperforms or
is competitive with the previous best models, including those with additional
MDS pre-training or with more parameters. It outperforms its corresponding PLM
backbone by up to 3 ROUGE-L points and is favored by humans.
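The listing itself contains no code; as a rough illustration of what enforcing hierarchy on the encoder side can look like for concatenated source documents, below is a minimal sketch of a block-wise attention mask in which tokens attend locally within their own document while a few designated global tokens mediate cross-document interaction. The function name and token layout are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (not the paper's implementation): a hierarchical attention
# mask for several documents concatenated into one encoder input.
# Tokens attend only within their own document, while designated "global"
# tokens (e.g. one lead token per document) attend across all documents.
import torch

def hierarchical_attention_mask(doc_ids: torch.Tensor,
                                global_tokens: torch.Tensor) -> torch.Tensor:
    """doc_ids:       (seq_len,) long tensor, document index of each token.
       global_tokens: (seq_len,) bool tensor, True for cross-document tokens.
       Returns a (seq_len, seq_len) bool mask; True means attention is allowed."""
    same_doc = doc_ids.unsqueeze(0) == doc_ids.unsqueeze(1)          # local, within-document attention
    cross = global_tokens.unsqueeze(0) | global_tokens.unsqueeze(1)  # global tokens see and are seen by all
    return same_doc | cross

# Toy example: three documents of lengths 4, 3 and 5; the first token of each is global.
doc_ids = torch.tensor([0] * 4 + [1] * 3 + [2] * 5)
global_tokens = torch.zeros(12, dtype=torch.bool)
global_tokens[[0, 4, 7]] = True
mask = hierarchical_attention_mask(doc_ids, global_tokens)  # shape (12, 12)
```

Such a mask could be fed to a standard Transformer encoder's attention; the paper's actual scheme also constrains the decoder, which this sketch does not cover.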
Related papers
- TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models [9.232693392690702]
TextHawk is a document-oriented Multimodal Large Language Model (MLLM)
It explores efficient fine-grained perception through four dedicated components.
We conduct extensive experiments on both general and document-oriented MLLM benchmarks, and show that TextHawk outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2024-04-14T09:48:37Z) - Absformer: Transformer-based Model for Unsupervised Multi-Document
Abstractive Summarization [1.066048003460524]
Multi-document summarization (MDS) is the task of condensing the text of multiple documents into a concise summary.
Abstractive MDS aims to generate a coherent and fluent summary for multiple documents using natural language generation techniques.
We propose Absformer, a new Transformer-based method for unsupervised abstractive summary generation.
arXiv Detail & Related papers (2023-06-07T21:18:23Z) - Plug-and-Play Document Modules for Pre-trained Models [92.9897146991974]
We propose to represent each document as a plug-and-play document module, i.e., a document plugin, for PTMs (PlugD).
By inserting document plugins into the backbone PTM for downstream tasks, we can encode a document one time to handle multiple tasks.
Experiments on 8 datasets of 4 typical NLP tasks show that PlugD enables models to encode documents once and for all across different scenarios.
arXiv Detail & Related papers (2023-05-28T08:01:40Z) - Peek Across: Improving Multi-Document Modeling via Cross-Document
Question-Answering [49.85790367128085]
We pre-train a generic multi-document model with a novel cross-document question-answering pre-training objective.
This novel multi-document QA formulation directs the model to better recover cross-text informational relations.
Unlike prior multi-document models that focus on either classification or summarization tasks, our pre-training objective formulation enables the model to perform tasks that involve both short text generation and long text generation.
arXiv Detail & Related papers (2023-05-24T17:48:40Z) - PDSum: Prototype-driven Continuous Summarization of Evolving
Multi-document Sets Stream [33.68263291948121]
We propose a new summarization problem, Evolving Multi-Document sets stream Summarization (EMDS).
We introduce a novel unsupervised algorithm PDSum with the idea of prototype-driven continuous summarization.
PDSum builds a lightweight prototype of each multi-document set and exploits it to adapt to new documents.
arXiv Detail & Related papers (2023-02-10T23:43:46Z) - MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders are
Better Dense Retrievers [140.0479479231558]
In this work, we aim to unify a variety of pre-training tasks into a multi-task pre-trained model, namely MASTER.
MASTER utilizes a shared-encoder multi-decoder architecture that can construct a representation bottleneck to compress the abundant semantic information across tasks into dense vectors.
arXiv Detail & Related papers (2022-12-15T13:57:07Z) - Document-aware Positional Encoding and Linguistic-guided Encoding for
Abstractive Multi-document Summarization [12.799359904396624]
One key challenge in multi-document summarization is capturing the relations among input documents, which distinguishes multi-document summarization (MDS) from single-document summarization (SDS).
We propose document-aware positional encoding and linguistic-guided encoding that can be fused with the Transformer architecture for MDS; a rough sketch of the positional-encoding idea appears after this list.
arXiv Detail & Related papers (2022-09-13T12:22:38Z) - A Multi-Document Coverage Reward for RELAXed Multi-Document
Summarization [11.02198476454955]
We propose fine-tuning an MDS baseline with a reward that balances a reference-based metric with coverage of the input documents; a toy sketch of such a reward appears after this list.
Experimental results over the Multi-News and WCEP MDS datasets show significant improvements of up to +0.95 pp average ROUGE score and +3.17 pp METEOR score over the baseline.
arXiv Detail & Related papers (2022-03-06T07:33:01Z) - Transfer Learning for Sequence Generation: from Single-source to
Multi-source [50.34044254589968]
We propose a two-stage finetuning method to alleviate the pretrain-finetune discrepancy and introduce a novel MSG model with a fine encoder to learn better representations in MSG tasks.
Our approach achieves new state-of-the-art results on the WMT17 APE task and multi-source translation task using the WMT14 test set.
arXiv Detail & Related papers (2021-05-31T09:12:38Z) - Data Augmentation for Abstractive Query-Focused Multi-Document
Summarization [129.96147867496205]
We present two QMDS training datasets, which we construct using two data augmentation methods.
These two datasets have complementary properties, i.e., QMDSCNN has real summaries but queries are simulated, while QMDSIR has real queries but simulated summaries.
We build end-to-end neural network models on the combined datasets that yield new state-of-the-art transfer results on DUC datasets.
arXiv Detail & Related papers (2021-03-02T16:57:01Z)
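For the "Document-aware Positional Encoding" entry above, here is a hedged sketch of one plausible realization: each token receives a position that restarts at every document boundary, plus a learned document-index embedding. The class, argument names, and sizes are assumptions for illustration, not the cited paper's code.

```python
# Hedged sketch (assumed details, not the cited paper's code): combine a
# within-document position embedding with a document-index embedding so the
# model can tell concatenated documents apart.
import torch
import torch.nn as nn

class DocumentAwarePositionalEncoding(nn.Module):
    def __init__(self, d_model: int, max_len: int = 4096, max_docs: int = 32):
        super().__init__()
        self.pos_emb = nn.Embedding(max_len, d_model)   # position within a document
        self.doc_emb = nn.Embedding(max_docs, d_model)  # which document a token came from

    def forward(self, doc_ids: torch.Tensor) -> torch.Tensor:
        """doc_ids: (batch, seq_len) document index per token.
           Returns positional encodings of shape (batch, seq_len, d_model)."""
        batch, seq_len = doc_ids.shape
        idx = torch.arange(seq_len, device=doc_ids.device).unsqueeze(0).expand(batch, -1)
        boundary = torch.ones_like(doc_ids, dtype=torch.bool)
        boundary[:, 1:] = doc_ids[:, 1:] != doc_ids[:, :-1]
        doc_start = torch.cummax(idx * boundary, dim=1).values  # index where the current document starts
        within_pos = idx - doc_start                             # position restarts at 0 in every document
        return self.pos_emb(within_pos) + self.doc_emb(doc_ids)

# Toy example: two documents of lengths 3 and 2 in one batch element.
enc = DocumentAwarePositionalEncoding(d_model=16)
out = enc(torch.tensor([[0, 0, 0, 1, 1]]))  # shape (1, 5, 16)
```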
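For the "Multi-Document Coverage Reward" entry above, the toy sketch below shows what such a balanced reward could look like; it approximates both the reference-based metric and per-document coverage with plain unigram overlap rather than ROUGE, and all names and the balancing weight are illustrative assumptions.

```python
# Toy sketch (illustrative only, not the cited paper's reward): balance a
# reference-based score against how much of the generated summary is
# supported by each input document, approximated with unigram overlap.
from collections import Counter

def unigram_recall(candidate: str, reference: str) -> float:
    """Fraction of reference unigrams that also appear in the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    return sum(min(cand[w], n) for w, n in ref.items()) / sum(ref.values())

def coverage_reward(summary: str, reference: str, documents: list[str],
                    balance: float = 0.5) -> float:
    """Weighted mix of a reference-based score and average input coverage."""
    reference_score = unigram_recall(candidate=summary, reference=reference)
    # Coverage of document i: fraction of the summary's unigrams found in that
    # document; averaging over documents rewards drawing on all inputs.
    coverage = sum(unigram_recall(candidate=doc, reference=summary)
                   for doc in documents) / max(len(documents), 1)
    return balance * reference_score + (1.0 - balance) * coverage

# Toy usage
docs = ["storm hits the coast on monday", "officials order evacuations near the coast"]
print(coverage_reward("storm hits coast, evacuations ordered",
                      "a storm hit the coast and evacuations were ordered", docs))
```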
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.