Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning
- URL: http://arxiv.org/abs/2406.00303v1
- Date: Sat, 1 Jun 2024 05:15:12 GMT
- Title: Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning
- Authors: Sangwon Ryu, Heejin Do, Yunsu Kim, Gary Geunbae Lee, Jungseul Ok
- Abstract summary: We propose multi-objective reinforcement learning tailored to generate balanced summaries across all four dimensions.
Unlike prior ROUGE-based rewards relying on reference summaries, we use a QA-based reward model that aligns with human preferences.
Our approach achieved substantial performance gains compared to baseline models on representative summarization datasets.
- Score: 12.083649916114402
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The evaluation of summary quality encompasses diverse dimensions such as consistency, coherence, relevance, and fluency. However, existing summarization methods often target a specific dimension and struggle to generate well-balanced summaries across multiple dimensions. In this paper, we propose multi-objective reinforcement learning tailored to generate balanced summaries across all four dimensions. We introduce two multi-dimensional optimization (MDO) strategies for adaptive learning: 1) MDO_min, which rewards the current lowest dimension score, and 2) MDO_pro, which optimizes multiple dimensions in a manner similar to multi-task learning and resolves conflicting gradients across dimensions through gradient projection. Unlike prior ROUGE-based rewards relying on reference summaries, we use a QA-based reward model that aligns with human preferences. Further, we discover the capability to regulate the length of summaries by adjusting the discount factor, seeking the generation of concise yet informative summaries that encapsulate crucial points. Our approach achieved substantial performance gains compared to baseline models on representative summarization datasets, particularly in the overlooked dimensions.
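The three mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the score dictionary layout, and the fixed projection order are all assumptions made for illustration, and the projection shown is a generic PCGrad-style rule used as a stand-in for the paper's gradient projection. The discounted-return helper shows one plausible reading of how the discount factor could influence summary length when the reward arrives only at the end of generation.

```python
from typing import Dict, List, Sequence

# The four quality dimensions listed in the abstract.
DIMENSIONS = ("consistency", "coherence", "relevance", "fluency")


def min_dimension_reward(scores: Dict[str, float]) -> float:
    """MDO_min-style reward (sketch): use the lowest per-dimension score."""
    return min(scores[d] for d in DIMENSIONS)


def _dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def project_conflicts(grads: List[List[float]]) -> List[float]:
    """PCGrad-style projection (assumed stand-in for MDO_pro).

    For each per-dimension gradient, remove the component that points
    against any other dimension's original gradient, then sum the results.
    The other gradients are visited in a fixed order here for determinism.
    """
    projected = [list(g) for g in grads]
    for i, g_i in enumerate(projected):
        for j, g_j in enumerate(grads):
            if i == j:
                continue
            overlap = _dot(g_i, g_j)
            norm_sq = _dot(g_j, g_j)
            if overlap < 0 and norm_sq > 0:
                scale = overlap / norm_sq
                for k in range(len(g_i)):
                    g_i[k] -= scale * g_j[k]
    size = len(grads[0])
    return [sum(g[k] for g in projected) for k in range(size)]


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Discounted sum of per-token rewards, computed from the last token back."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


if __name__ == "__main__":
    scores = {"consistency": 0.9, "coherence": 0.4, "relevance": 0.7, "fluency": 0.8}
    print(min_dimension_reward(scores))
    print(project_conflicts([[1.0, 0.0], [-1.0, 1.0]]))
    # Same terminal reward, different lengths: the shorter sequence keeps more of it.
    print(discounted_return([0.0, 0.0, 1.0], 0.5), discounted_return([0.0, 1.0], 0.5))
```

Under these assumptions, a discount factor below 1 shrinks a terminal reward more for longer outputs, which is one way a smaller discount factor could favor more concise summaries.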
Related papers
- MODABS: Multi-Objective Learning for Dynamic Aspect-Based Summarization [29.111115148808196]
We introduce a novel multi-objective learning framework employing a Longformer-Encoder-Decoder for this task.
We show our method significantly outperforms baselines on three diverse datasets.
arXiv Detail & Related papers (2024-06-05T17:32:28Z)
- Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z)
- Multiform Evolution for High-Dimensional Problems with Low Effective Dimensionality [36.44425198302701]
We scale evolutionary algorithms to high-dimensional optimization problems that deceptively possess a low effective dimensionality.
A multiform evolutionary algorithm is developed for unifying all formulations into a single multi-task setting.
The resultant joint optimization enables the target task to efficiently reuse solutions evolved across various low-dimensional searches.
arXiv Detail & Related papers (2023-12-30T08:13:47Z)
- 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z)
- RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching).
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z)
- Online Multi-Task Learning with Recursive Least Squares and Recursive Kernel Methods [50.67996219968513]
We introduce two novel approaches for Online Multi-Task Learning (MTL) Regression Problems.
We achieve exact and approximate recursions with quadratic per-instance cost on the dimension of the input space.
We compare our online MTL methods to other contenders in a real-world wind speed forecasting case study.
arXiv Detail & Related papers (2023-08-03T01:41:34Z)
- A Unified Model and Dimension for Interactive Estimation [20.39351301232109]
We introduce a measure called dissimilarity dimension which largely captures learnability in our model.
We show that our framework subsumes and unifies two classic learning models: statistical-query learning and structured bandits.
arXiv Detail & Related papers (2023-06-09T18:21:04Z)
- Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth [83.94528876742096]
We tackle the MTL problem of two dense tasks, i.e., semantic segmentation and depth estimation, and present a novel attention module called Cross-Channel Attention Module (CCAM).
In a true symbiotic spirit, we then formulate a novel data augmentation for the semantic segmentation task using predicted depth called AffineMix, and a simple depth augmentation using predicted semantics called ColorAug.
Finally, we validate the performance gain of the proposed method on the Cityscapes dataset, which helps us achieve state-of-the-art results for a semi-supervised joint model based on depth and semantic
arXiv Detail & Related papers (2022-06-21T17:40:55Z)
- PoBRL: Optimizing Multi-Document Summarization by Blending Reinforcement Learning Policies [68.8204255655161]
We propose a reinforcement learning based framework PoBRL for solving multi-document summarization.
Our strategy decouples this multi-objective optimization into different subproblems that can be solved individually by reinforcement learning.
Our empirical analysis shows state-of-the-art performance on several multi-document datasets.
arXiv Detail & Related papers (2021-05-18T02:55:42Z)