Learning from Bootstrapping and Stepwise Reinforcement Reward: A
Semi-Supervised Framework for Text Style Transfer
- URL: http://arxiv.org/abs/2205.09324v1
- Date: Thu, 19 May 2022 05:18:06 GMT
- Title: Learning from Bootstrapping and Stepwise Reinforcement Reward: A
Semi-Supervised Framework for Text Style Transfer
- Authors: Zhengyuan Liu, Nancy F. Chen
- Abstract summary: We propose a semi-supervised framework for text style transfer.
First, the learning process is bootstrapped with supervision guided by automatically constructed pseudo-parallel pairs.
Then the model learns from unlabeled data via reinforcement rewards.
- Score: 30.622772801446132
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text style transfer is an important task in controllable language generation.
Supervised approaches have pushed performance improvement on style-oriented
rewriting such as formality conversion. However, challenges remain due to the
scarcity of large-scale parallel data in many domains. While unsupervised
approaches do not rely on annotated sentence pairs for each style, they are
often plagued with instability issues such as mode collapse or quality
degradation. To take advantage of both supervised and unsupervised paradigms
and tackle the challenges, in this work, we propose a semi-supervised framework
for text style transfer. First, the learning process is bootstrapped with
supervision guided by automatically constructed pseudo-parallel pairs using
lexical and semantic-based methods. Then the model learns from unlabeled data
via reinforcement rewards. Specifically, we propose to improve the
sequence-to-sequence policy gradient via stepwise reward optimization,
providing fine-grained learning signals and stabilizing the reinforcement
learning process. Experimental results show that the proposed approach achieves
state-of-the-art performance on multiple datasets, and produces effective
generation with as little as 10% of the training data.
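To make the stepwise-reward idea concrete, here is a minimal sketch grounded only in the abstract (not the authors' released code), contrasting a sequence-level policy gradient with the stepwise variant: instead of one scalar reward for the whole output, each decoding step receives its own reward, giving every token a fine-grained learning signal. The reward sources and tensor shapes are illustrative assumptions.
```python
# Illustrative sketch only; `log_probs` and the reward tensors are assumed
# to come from a seq2seq model and task-specific scorers (e.g. a style
# classifier). Not the authors' implementation.
import torch

def sequence_level_pg_loss(log_probs: torch.Tensor,
                           seq_reward: torch.Tensor) -> torch.Tensor:
    """Vanilla REINFORCE: one scalar reward per sampled sequence.

    log_probs:  (batch, seq_len) log-probs of the sampled output tokens
    seq_reward: (batch,)         sequence-level reward, e.g. a style score
    """
    return -(log_probs.sum(dim=1) * seq_reward).mean()

def stepwise_pg_loss(log_probs: torch.Tensor,
                     step_rewards: torch.Tensor) -> torch.Tensor:
    """Stepwise variant: a reward at every decoding step, so each token's
    log-prob is weighted by its own fine-grained learning signal.

    step_rewards: (batch, seq_len) per-token rewards
    """
    return -(log_probs * step_rewards).sum(dim=1).mean()
```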
Related papers
- Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness [3.2925222641796554]
"pointer-guided segment ordering" (SO) is a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations.
Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures.
arXiv Detail & Related papers (2024-06-06T15:17:51Z) - Sequential Visual and Semantic Consistency for Semi-supervised Text
Recognition [56.968108142307976]
Scene text recognition (STR) is a challenging task that requires large-scale annotated data for training.
Most existing STR methods resort to synthetic data, which may introduce domain discrepancy and degrade the performance of STR models.
This paper proposes a novel semi-supervised learning method for STR that incorporates word-level consistency regularization from both visual and semantic aspects.
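As a rough illustration of consistency regularization in general (not this paper's exact visual/semantic formulation, and the shapes below are assumptions), one common recipe penalizes divergence between a model's predictions on two augmented views of the same unlabeled input:
```python
# Generic consistency-regularization sketch; logits are assumed to come
# from the same model applied to two augmentations of one unlabeled input.
import torch.nn.functional as F
from torch import Tensor

def consistency_loss(logits_a: Tensor, logits_b: Tensor) -> Tensor:
    """Symmetric KL between predictions on two views, (batch, num_classes)."""
    log_p = F.log_softmax(logits_a, dim=-1)
    log_q = F.log_softmax(logits_b, dim=-1)
    return 0.5 * (F.kl_div(log_p, log_q, log_target=True, reduction="batchmean")
                  + F.kl_div(log_q, log_p, log_target=True, reduction="batchmean"))
```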
arXiv Detail & Related papers (2024-02-24T13:00:54Z) - Prefix-Tuning Based Unsupervised Text Style Transfer [29.86587278794342]
Unsupervised text style transfer aims at training a generative model that can alter the style of the input sentence while preserving its content.
In this paper, we employ powerful pre-trained large language models and present a new prefix-tuning-based method for unsupervised text style transfer.
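For readers unfamiliar with the mechanism, a minimal sketch of the general prefix-tuning idea follows (a simplified input-level variant; the original method also injects prefixes into each attention layer, and all names and shapes here are assumptions): trainable "virtual token" vectors are prepended to a frozen language model's input, and only those vectors are updated.
```python
# Simplified prefix-tuning sketch; the backbone LM is assumed frozen.
import torch
import torch.nn as nn

class Prefix(nn.Module):
    """Trainable prefix prepended to a frozen LM's input embeddings."""

    def __init__(self, prefix_len: int, hidden_dim: int):
        super().__init__()
        self.prefix = nn.Parameter(torch.randn(prefix_len, hidden_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, hidden_dim) from the frozen model
        batch = input_embeds.size(0)
        return torch.cat([self.prefix.expand(batch, -1, -1), input_embeds], dim=1)
```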
arXiv Detail & Related papers (2023-10-23T06:13:08Z) - Text Revision by On-the-Fly Representation Optimization [76.11035270753757]
Current state-of-the-art methods formulate text revision tasks as sequence-to-sequence learning problems.
We present an iterative in-place editing approach for text revision, which requires no parallel data.
It achieves competitive and even better performance than state-of-the-art supervised methods on text simplification.
arXiv Detail & Related papers (2022-04-15T07:38:08Z) - $\textit{latent}$-GLAT: Glancing at Latent Variables for Parallel Text
Generation [65.29170569821093]
Parallel text generation has received widespread attention due to its success in generation efficiency.
In this paper, we propose $\textit{latent}$-GLAT, which employs discrete latent variables to capture word categorical information.
Experiment results show that our method outperforms strong baselines without the help of an autoregressive model.
arXiv Detail & Related papers (2022-04-05T07:34:12Z) - Gradient-guided Unsupervised Text Style Transfer via Contrastive
Learning [6.799826701166569]
We propose a gradient-guided model through a contrastive paradigm for text style transfer, to explicitly gather sentences with similar semantics.
Experiments on two datasets show the effectiveness of our proposed approach compared to state-of-the-art methods.
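As a generic illustration of the contrastive paradigm (an InfoNCE-style objective, not this paper's exact loss; all shapes are assumptions): a sentence representation is pulled toward a semantically similar "positive" and pushed away from negatives.
```python
# Generic InfoNCE sketch; inputs are assumed to be L2-normalized
# sentence embeddings from any encoder.
import torch
import torch.nn.functional as F

def info_nce(anchor: torch.Tensor, positive: torch.Tensor,
             negatives: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """anchor, positive: (dim,); negatives: (k, dim)."""
    pos_sim = (anchor * positive).sum().unsqueeze(0) / tau  # (1,)
    neg_sim = negatives @ anchor / tau                      # (k,)
    logits = torch.cat([pos_sim, neg_sim]).unsqueeze(0)     # positive at index 0
    target = torch.zeros(1, dtype=torch.long)               # class 0 = positive
    return F.cross_entropy(logits, target)
```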
arXiv Detail & Related papers (2022-01-23T12:45:00Z) - Bridge the Gap Between CV and NLP! A Gradient-based Textual Adversarial
Attack Framework [17.17479625646699]
We propose a unified framework to craft textual adversarial samples.
In this paper, we instantiate our framework with an attack algorithm named Textual Projected Gradient Descent (T-PGD).
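A minimal sketch of the underlying PGD idea, adapted to continuous input embeddings (a generic L-infinity PGD step, not the exact T-PGD update, which further maps perturbed embeddings back to discrete tokens; step sizes are assumptions):
```python
# Generic PGD-on-embeddings sketch, not the paper's implementation.
import torch

def pgd_step(orig: torch.Tensor, cur: torch.Tensor, grad: torch.Tensor,
             step: float = 1e-2, eps: float = 0.1) -> torch.Tensor:
    """One projected-gradient ascent step on input embeddings.

    orig: unperturbed embeddings; cur: current adversarial embeddings;
    grad: gradient of the attack loss w.r.t. `cur`.
    """
    moved = cur + step * grad.sign()                 # ascend the attack loss
    delta = torch.clamp(moved - orig, -eps, eps)     # project into L_inf ball
    return orig + delta
```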
arXiv Detail & Related papers (2021-10-28T17:31:51Z) - Learning to Selectively Learn for Weakly-supervised Paraphrase
Generation [81.65399115750054]
We propose a novel approach to generate high-quality paraphrases with weak supervision data.
Specifically, we tackle the weakly-supervised paraphrase generation problem by obtaining abundant weakly-labeled parallel sentences via retrieval-based pseudo-paraphrase expansion.
We demonstrate that our approach achieves significant improvements over existing unsupervised approaches, and is even comparable in performance with supervised state-of-the-art methods.
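To sketch the retrieval-based expansion step in the summary above (a hedged approximation; `embed` stands in for any sentence encoder, and the threshold is an assumption): each query sentence is paired with its nearest neighbor from an unlabeled pool by cosine similarity, and only confident matches are kept as weakly-labeled pairs.
```python
# Hypothetical retrieval sketch; `embed` is an assumed sentence encoder
# returning L2-normalized row vectors.
import numpy as np

def build_pseudo_pairs(queries, pool, embed, threshold=0.8):
    """Pair each query with its most similar pool sentence (cosine)."""
    q_vecs = embed(queries)            # (n_q, dim)
    p_vecs = embed(pool)               # (n_p, dim)
    sims = q_vecs @ p_vecs.T           # cosine similarities
    pairs = []
    for i, row in enumerate(sims):
        j = int(np.argmax(row))
        if row[j] >= threshold:        # keep only confident matches
            pairs.append((queries[i], pool[j]))
    return pairs
```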
arXiv Detail & Related papers (2021-09-25T23:31:13Z) - Text Generation with Efficient (Soft) Q-Learning [91.47743595382758]
Reinforcement learning (RL) offers a more flexible solution by allowing users to plug in arbitrary task metrics as reward.
We introduce a new RL formulation for text generation from the soft Q-learning perspective.
We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
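For context, the standard soft Q-learning quantities look as follows when vocabulary tokens are treated as actions (textbook relations, not this paper's full training objective; `alpha` is the entropy temperature):
```python
# Standard soft Q-learning identities, stated over per-token Q-values.
import torch

def soft_value(q: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Soft state value V(s) = alpha * logsumexp(Q(s, .) / alpha)."""
    return alpha * torch.logsumexp(q / alpha, dim=-1)

def soft_policy(q: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Induced policy pi(a|s) = exp((Q(s,a) - V(s)) / alpha) = softmax(Q/alpha)."""
    return torch.softmax(q / alpha, dim=-1)
```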
arXiv Detail & Related papers (2021-06-14T18:48:40Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
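A simplified sketch of the Dynamic Blocking idea (the full algorithm samples which source bigrams to block and for how long; this deterministic version and its token-id interface are assumptions): after the model emits a token that appears in the source, the source token that immediately follows it is forbidden at the next step, nudging the model to rephrase rather than copy.
```python
# Simplified, deterministic Dynamic Blocking sketch; `logits` is indexed
# by token id, as in a vocabulary-sized output layer.
import math
from typing import List

def block_source_successors(source_ids: List[int], last_generated: int,
                            logits: List[float]) -> List[float]:
    """Suppress source tokens that immediately follow `last_generated`."""
    blocked = {source_ids[i + 1]
               for i in range(len(source_ids) - 1)
               if source_ids[i] == last_generated}
    return [-math.inf if tok in blocked else score
            for tok, score in enumerate(logits)]
```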
arXiv Detail & Related papers (2020-10-24T11:55:28Z)