Related papers: PoBRL: Optimizing Multi-Document Summarization by Blending Reinforcement Learning Policies

URL: http://arxiv.org/abs/2105.08244v1
Date: Tue, 18 May 2021 02:55:42 GMT
Title: PoBRL: Optimizing Multi-Document Summarization by Blending Reinforcement Learning Policies
Authors: Andy Su, Difei Su, John M. Mulvey, H. Vincent Poor
Abstract summary: We propose a reinforcement learning based framework PoBRL for solving multi-document summarization. Our strategy decouples this multi-objective optimization into different subproblems that can be solved individually by reinforcement learning. Our empirical analysis shows state-of-the-art performance on several multi-document datasets.
Score: 68.8204255655161
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a novel reinforcement learning based framework PoBRL for solving multi-document summarization. PoBRL jointly optimizes over the following three objectives necessary for a high-quality summary: importance, relevance, and length. Our strategy decouples this multi-objective optimization into different subproblems that can be solved individually by reinforcement learning. Utilizing PoBRL, we then blend the learned policies together to produce a summary that is a concise and complete representation of the original input. Our empirical analysis shows state-of-the-art performance on several multi-document datasets. Human evaluation also shows that our method produces high-quality output.
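
Illustrative sketch (not from the paper): the abstract describes learning separate policies per objective and then blending them. The Python below shows one simple way such a blend could work for extractive selection, under assumptions that are ours alone: each sentence has a precomputed importance score, a pairwise similarity function stands in for a redundancy signal, the blend is a fixed convex weight, and length is enforced as a sentence budget. The names `blend_and_select`, `blend_weight`, and `SIM` are hypothetical, and the greedy rule is a generic MMR-style heuristic rather than the authors' algorithm.

```python
from typing import Callable, List, Sequence


def blend_and_select(
    importance: Sequence[float],
    similarity: Callable[[int, int], float],
    blend_weight: float = 0.5,
    max_sentences: int = 2,
) -> List[int]:
    """Greedily pick sentence indices using a blend of two scores.

    Assumption: `importance[i]` plays the role of one learned policy's
    preference for sentence i, and `similarity(i, j)` is used to penalize
    sentences that repeat what has already been selected.
    """
    selected: List[int] = []
    candidates = set(range(len(importance)))

    # The length objective is modeled here as a hard sentence budget.
    while candidates and len(selected) < max_sentences:

        def blended(i: int) -> float:
            # Redundancy is the highest similarity to any chosen sentence.
            redundancy = max((similarity(i, j) for j in selected), default=0.0)
            return blend_weight * importance[i] - (1.0 - blend_weight) * redundancy

        # Sorting first makes ties resolve to the lowest index.
        best = max(sorted(candidates), key=blended)
        selected.append(best)
        candidates.remove(best)

    return selected


# Toy data: sentences 0 and 1 are near-duplicates; sentence 2 is distinct.
SIM = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
scores = [0.9, 0.85, 0.4]

print(blend_and_select(scores, lambda i, j: SIM[i][j], blend_weight=0.5))
print(blend_and_select(scores, lambda i, j: SIM[i][j], blend_weight=1.0))
```

With an even blend, the near-duplicate sentence 1 is skipped in favor of the less important but distinct sentence 2. With `blend_weight=1.0`, only importance matters and the two highest-scoring sentences are taken.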

Related papers

Co-Reinforcement Learning for Unified Multimodal Understanding and Generation [53.03303124157899]
This paper presents a pioneering exploration of reinforcement learning (RL) via group relative policy optimization for unified multimodal large language models (ULMs). We introduce CoRL, a co-reinforcement learning framework comprising a unified RL stage for joint optimization and a refined RL stage for task-specific enhancement. With the proposed CoRL, our resulting model, ULM-R1, achieves average improvements of 7% on three text-to-image generation datasets and 23% on nine multimodal understanding benchmarks.
arXiv Detail & Related papers (2025-05-23T06:41:07Z)
EMORL: Ensemble Multi-Objective Reinforcement Learning for Efficient and Flexible LLM Fine-Tuning [6.675088737484839]
We introduce an Ensemble Multi-Objective RL (EMORL) framework that fine-tunes multiple models with individual objectives to improve efficiency and flexibility. Our method is the first to aggregate the hidden states of individual models, incorporating contextual information from multiple objectives. We demonstrate the advantages of EMORL against existing baselines in experiments on the PAIR and Psych8k datasets.
arXiv Detail & Related papers (2025-05-05T11:30:46Z)
Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning [12.083649916114402]
We propose multi-objective reinforcement learning tailored to generate balanced summaries across all four dimensions. Unlike prior ROUGE-based rewards relying on reference summaries, we use a QA-based reward model that aligns with human preferences. Our approach achieved substantial performance gains compared to baseline models on representative summarization datasets.
arXiv Detail & Related papers (2024-06-01T05:15:12Z)
Towards an Information Theoretic Framework of Context-Based Offline Meta-Reinforcement Learning [48.79569442193824]
We show that COMRL algorithms are essentially optimizing the same mutual information objective between the task variable $M$ and its latent representation $Z$ by implementing various approximate bounds. This work lays the information theoretic foundation for COMRL methods, leading to a better understanding of task representation learning in the context of reinforcement learning.
arXiv Detail & Related papers (2024-02-04T09:58:42Z)
PELMS: Pre-training for Effective Low-Shot Multi-Document Summarization [4.6493060043204535]
We present PELMS, a pre-trained model that generates concise, fluent, and faithful summaries. We compile MultiPT, a multi-document pre-training corpus containing over 93 million documents to form more than 3 million unlabeled topic-centric document clusters. Our approach consistently outperforms competitive comparisons with respect to overall informativeness, abstractiveness, coherence, and faithfulness.
arXiv Detail & Related papers (2023-11-16T12:05:23Z)
Unsupervised Multi-document Summarization with Holistic Inference [41.58777650517525]
This paper proposes a new holistic framework for unsupervised multi-document extractive summarization. Subset Representative Index (SRI) balances the importance and diversity of a subset of sentences from the source documents. Our findings suggest that diversity is essential for improving multi-document summary performance.
arXiv Detail & Related papers (2023-09-08T02:56:30Z)
Inverse Reinforcement Learning for Text Summarization [52.765898203824975]
We introduce inverse reinforcement learning (IRL) as an effective paradigm for training abstractive summarization models. Experimental results across datasets in different domains demonstrate the superiority of our proposed IRL model for summarization over MLE and RL baselines.
arXiv Detail & Related papers (2022-12-19T23:45:05Z)
Evaluating and Improving Factuality in Multimodal Abstractive Summarization [91.46015013816083]
We propose CLIPBERTScore to leverage the robustness and strong factuality detection performance between image-summary and document-summary. We show that this simple combination of two metrics in the zero-shot setting achieves higher correlations than existing factuality metrics for document summarization. Our analysis demonstrates the robustness and high correlation of CLIPBERTScore and its components on four factuality metric-evaluation benchmarks.
arXiv Detail & Related papers (2022-11-04T16:50:40Z)
A Multi-Document Coverage Reward for RELAXed Multi-Document Summarization [11.02198476454955]
We propose fine-tuning an MDS baseline with a reward that balances a reference-based metric with coverage of the input documents. Experimental results over the Multi-News and WCEP MDS datasets show significant improvements of up to +0.95 pp average ROUGE score and +3.17 pp METEOR score over the baseline.
arXiv Detail & Related papers (2022-03-06T07:33:01Z)
Provable Multi-Objective Reinforcement Learning with Generative Models [98.19879408649848]
We study the problem of single policy MORL, which learns an optimal policy given the preference of objectives. Existing methods require strong assumptions such as exact knowledge of the multi-objective decision process. We propose a new algorithm called model-based envelop value iteration (EVI), which generalizes the enveloped multi-objective $Q$-learning algorithm.
arXiv Detail & Related papers (2020-11-19T22:35:31Z)
SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression [61.97200991151141]
SummPip is an unsupervised method for multi-document summarization. We convert the original documents to a sentence graph, taking both linguistic and deep representation into account. We then apply spectral clustering to obtain multiple clusters of sentences, and finally compress each cluster to generate the final summary.
arXiv Detail & Related papers (2020-07-17T13:01:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.