Q-learning with Language Model for Edit-based Unsupervised Summarization
- URL: http://arxiv.org/abs/2010.04379v1
- Date: Fri, 9 Oct 2020 05:47:00 GMT
- Title: Q-learning with Language Model for Edit-based Unsupervised Summarization
- Authors: Ryosuke Kohita, Akifumi Wachi, Yang Zhao, Ryuki Tachibana
- Abstract summary: We propose a new approach based on Q-learning with edit-based summarization.
The method combines two key modules, an Editorial Agent and a Language Model converter (EALM).
Q-learning is leveraged to train the agent to produce proper edit actions.
- Score: 19.332743860240264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised methods are promising for abstractive text summarization in that
parallel corpora are not required. However, their performance is still far
from satisfactory, so research on promising solutions is ongoing. In
this paper, we propose a new approach based on Q-learning with edit-based
summarization. The method combines two key modules to form an Editorial Agent
and Language Model converter (EALM). The agent predicts edit actions (e.g.,
delete, keep, and replace), and then the LM converter deterministically
generates a summary on the basis of the action signals. Q-learning is leveraged
to train the agent to produce proper edit actions. Experimental results show
that EALM delivered competitive performance compared with previous
encoder-decoder-based methods, even with truly zero paired data (i.e., no
validation set). Defining the task as Q-learning enables us not only to develop
a competitive method but also to make the latest techniques in reinforcement
learning available for unsupervised summarization. We also conduct qualitative
analysis, providing insights into future study on unsupervised summarizers.
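The abstract's setup, an agent choosing per-token edit actions (keep, delete, replace) trained with Q-learning, can be sketched minimally as follows. This is an illustrative sketch, not the paper's implementation: the tabular Q-values, epsilon-greedy policy, hyperparameters, and the `<LM>` placeholder standing in for the Language Model converter are all assumptions made to keep the example self-contained.

```python
import random
from collections import defaultdict

# Edit actions the agent chooses per token, as described in the abstract.
ACTIONS = ["keep", "delete", "replace"]

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2  # illustrative hyperparameters

# Tabular Q-values keyed by (state, action). The paper trains a learned
# agent; a table is used here only to keep the sketch self-contained.
Q = defaultdict(float)

def choose_action(state, epsilon=EPSILON):
    """Epsilon-greedy selection over edit actions."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state):
    """Standard one-step Q-learning update."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

def summarize(tokens, policy):
    """Apply per-token edit actions. 'replace' emits a placeholder where the
    LM converter would deterministically generate a substitute token."""
    out = []
    for i, tok in enumerate(tokens):
        action = policy(i)
        if action == "keep":
            out.append(tok)
        elif action == "replace":
            out.append("<LM>")  # stand-in for the LM converter's output
        # "delete" drops the token entirely
    return out
```

In the paper, the reward for `q_update` would come from evaluating the generated summary without paired references; how that reward is computed is the crux of the unsupervised setting and is not modeled here.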
Related papers
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing the influence of a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z) - Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning.
LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors.
We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z) - Recursive Introspection: Teaching Language Model Agents How to Self-Improve [30.086494067593268]
We develop RISE: Recursive IntroSpEction, an approach for fine-tuning large language models.
Our experiments show that RISE enables Llama2, Llama3, and Mistral models to improve themselves with more turns on math reasoning tasks.
arXiv Detail & Related papers (2024-07-25T17:35:59Z) - Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that SQC-Score is preferred by human annotators over the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z) - Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration [17.27164535440641]
Posterior sampling is a promising approach, but it requires Bayesian inference and dynamic programming.
We show that even though partial models exclude relevant information from the environment, they can nevertheless lead to good policies.
arXiv Detail & Related papers (2023-02-08T18:35:24Z) - Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer [80.50327229467993]
We show that a single model trained end-to-end can achieve both competitive retrieval and QA performance.
We show that end-to-end adaptation significantly boosts its performance on out-of-domain datasets in both supervised and unsupervised settings.
arXiv Detail & Related papers (2022-12-05T04:51:21Z) - Learning Non-Autoregressive Models from Search for Unsupervised Sentence Summarization [20.87460375478907]
Text summarization aims to generate a short summary for an input text.
In this work, we propose a Non-Autoregressive Unsupervised Summarization approach.
Experiments show that NAUS achieves state-of-the-art performance for unsupervised summarization.
arXiv Detail & Related papers (2022-05-28T21:09:23Z) - An Imitation Learning Curriculum for Text Editing with Non-Autoregressive Models [22.996178360362734]
We show that imitation learning algorithms for machine translation introduce mismatches between training and inference that lead to undertraining and poor generalization in editing scenarios.
We show the efficacy of these strategies on two challenging English editing tasks: controllable text simplification and abstractive summarization.
arXiv Detail & Related papers (2022-03-17T17:36:23Z) - elBERto: Self-supervised Commonsense Learning for Question Answering [131.51059870970616]
We propose a Self-supervised Bidirectional Representation Learning of Commonsense framework, which is compatible with off-the-shelf QA model architectures.
The framework comprises five self-supervised tasks to force the model to fully exploit the additional training signals from contexts containing rich commonsense.
elBERto achieves substantial improvements on out-of-paragraph and no-effect questions where simple lexical similarity comparison does not help.
arXiv Detail & Related papers (2022-03-17T16:23:45Z) - Towards Model-informed Precision Dosing with Expert-in-the-loop Machine
Learning [0.0]
We consider an ML framework that may accelerate model learning and improve its interpretability by incorporating human experts into the model learning loop.
We propose a novel human-in-the-loop ML framework aimed at learning problems in which the cost of data annotation is high.
With an application to precision dosing, our experimental results show that the approach can learn interpretable rules from data and may potentially lower experts' workload.
arXiv Detail & Related papers (2021-06-28T03:45:09Z) - Few-Shot Learning for Opinion Summarization [117.70510762845338]
Opinion summarization is the automatic creation of text reflecting subjective information expressed in multiple documents.
In this work, we show that even a handful of summaries is sufficient to bootstrap generation of the summary text.
Our approach substantially outperforms previous extractive and abstractive methods in automatic and human evaluation.
arXiv Detail & Related papers (2020-04-30T15:37:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.