Unsupervised Text Generation by Learning from Search
- URL: http://arxiv.org/abs/2007.08557v1
- Date: Thu, 9 Jul 2020 04:34:48 GMT
- Title: Unsupervised Text Generation by Learning from Search
- Authors: Jingjing Li, Zichao Li, Lili Mou, Xin Jiang, Michael R. Lyu, Irwin King
- Abstract summary: TGLS is a novel framework for unsupervised Text Generation by Learning from Search.
We demonstrate the effectiveness of TGLS on two real-world natural language generation tasks, paraphrase generation and text formalization.
- Score: 86.51619839836331
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we present TGLS, a novel framework for unsupervised
Text Generation by Learning from Search. We start by applying a strong search
algorithm (in particular, simulated annealing) to a heuristically defined
objective that (roughly) estimates the quality of sentences. A conditional
generative model then learns from the search results and, in doing so, smooths
out the noise of search. The alternation between search and learning can be
repeated for performance bootstrapping. We demonstrate the effectiveness of
TGLS on two real-world natural language generation tasks, paraphrase generation
and text formalization. Our model significantly outperforms unsupervised
baseline methods on both tasks. In particular, it achieves performance
comparable to state-of-the-art supervised methods in paraphrase generation.
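To make the search-and-learn loop in the abstract concrete, here is a minimal, self-contained sketch: simulated annealing over word-level edits against a toy, heuristically defined quality score, with the accepted outputs handed to a stubbed-out learning step, and the two alternated. The scorer, edit proposals, corpus, and `learn` function are all illustrative placeholders rather than TGLS's actual objective or model.

```python
import math
import random

def quality(candidate, source):
    """Placeholder objective: a stand-in for the heuristic score that
    (roughly) estimates sentence quality, e.g. by combining fluency and
    similarity terms. This is NOT the paper's actual scorer."""
    overlap = len(set(candidate) & set(source)) / max(len(set(source)), 1)
    brevity = 1.0 / (1.0 + abs(len(candidate) - len(source)))
    return overlap + brevity

def propose(candidate, vocab):
    """Random word-level edit: replace, insert, or delete one token."""
    cand = list(candidate)
    i = random.randrange(len(cand))
    op = random.choice(["replace", "insert", "delete"])
    if op == "replace":
        cand[i] = random.choice(vocab)
    elif op == "insert":
        cand.insert(i, random.choice(vocab))
    elif len(cand) > 1:
        del cand[i]
    return cand

def simulated_annealing(source, vocab, steps=200, t0=1.0):
    """Hill-climb with occasional downhill moves; worse candidates are
    accepted with probability exp(delta / T), and T decays over time."""
    current = list(source)
    for step in range(steps):
        temperature = t0 * (1.0 - step / steps) + 1e-6
        candidate = propose(current, vocab)
        delta = quality(candidate, source) - quality(current, source)
        if delta > 0 or random.random() < math.exp(delta / temperature):
            current = candidate
    return current

def learn(pairs):
    """Placeholder for fitting a conditional generative model on the
    (input, search output) pairs collected above."""
    print(f"training on {len(pairs)} searched pairs")

# Alternate search and learning; repeating the loop is the "bootstrapping".
corpus = [["the", "report", "is", "due", "tomorrow"],
          ["please", "send", "me", "the", "file"]]
vocab = sorted({w for s in corpus for w in s})
for _ in range(2):
    pairs = [(src, simulated_annealing(src, vocab)) for src in corpus]
    learn(pairs)
```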
Related papers
- Reinforcement Learning with Token-level Feedback for Controllable Text Generation [16.117006822479407]
We propose a novel reinforcement learning algorithm named TOLE, which formulates TOken-LEvel rewards for controllable text generation.
Experimental results show that our algorithm can achieve superior performance on both single-attribute and multi-attribute control tasks.
arXiv Detail & Related papers (2024-03-18T08:18:37Z)
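As a rough illustration of the token-level (rather than sequence-level) reward idea in the TOLE entry above, the sketch below computes a per-token reward-weighted policy-gradient loss on toy tensors; the rewards, shapes, and objective are generic assumptions, not TOLE's actual algorithm.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
seq_len, vocab_size = 4, 10

# Toy stand-ins: logits from a policy LM, the tokens it sampled, and a
# reward attached to every generated token (e.g. from an attribute scorer).
logits = torch.randn(seq_len, vocab_size, requires_grad=True)
tokens = torch.multinomial(F.softmax(logits.detach(), dim=-1), 1).squeeze(-1)
token_rewards = torch.tensor([0.2, -0.1, 0.5, 0.8])

# REINFORCE-style loss with per-token credit assignment.
log_probs = F.log_softmax(logits, dim=-1).gather(-1, tokens.unsqueeze(-1)).squeeze(-1)
loss = -(token_rewards * log_probs).mean()
loss.backward()
print(loss.item())
```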
- Retrieval is Accurate Generation [99.24267226311157]
We introduce a novel method that selects context-aware phrases from a collection of supporting documents.
Our model achieves the best performance and the lowest latency among several retrieval-augmented baselines.
arXiv Detail & Related papers (2024-02-27T14:16:19Z)
- Search and Learning for Unsupervised Text Generation [27.940118426945872]
In this paper, I will introduce our recent work on search and learning approaches to unsupervised text generation.
A machine learning model further learns from the search results to smooth out noise and improve efficiency.
arXiv Detail & Related papers (2023-09-18T05:44:11Z)
- Learning to Rank in Generative Retrieval [62.91492903161522]
Generative retrieval aims to generate identifier strings of relevant passages as the retrieval target.
We propose a learning-to-rank framework for generative retrieval, dubbed LTRGR.
This framework only requires an additional learning-to-rank training phase to enhance current generative retrieval systems.
arXiv Detail & Related papers (2023-06-27T05:48:14Z)
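The LTRGR entry above adds a learning-to-rank phase on top of a generative retriever. Below is a generic sketch of such a phase as a margin ranking loss over the scores a model assigns to relevant versus irrelevant passage identifiers; the scores are toy values, and the specific loss and margin are assumptions rather than the paper's exact formulation.

```python
import torch

# Toy autoregressive scores (e.g. summed log-probabilities) that a generative
# retriever assigns to relevant vs. irrelevant passage identifiers.
score_pos = torch.tensor([-3.2, -5.0, -2.1], requires_grad=True)
score_neg = torch.tensor([-4.0, -4.5, -6.3], requires_grad=True)

# Margin ranking loss: push each relevant identifier's score above the
# paired irrelevant one's by at least `margin`.
margin = 1.0
loss = torch.clamp(margin - (score_pos - score_neg), min=0).mean()
loss.backward()
print(loss.item())
```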
- A Survey on Retrieval-Augmented Text Generation [53.04991859796971]
Retrieval-augmented text generation has remarkable advantages and has achieved state-of-the-art performance in many NLP tasks.
The survey first highlights the generic paradigm of retrieval-augmented generation and then reviews notable approaches according to different tasks.
arXiv Detail & Related papers (2022-02-02T16:18:41Z)
- Text Generation with Efficient (Soft) Q-Learning [91.47743595382758]
Reinforcement learning (RL) offers a more flexible solution by allowing users to plug in arbitrary task metrics as rewards.
We introduce a new RL formulation for text generation from the soft Q-learning perspective.
We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
arXiv Detail & Related papers (2021-06-14T18:48:40Z)
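The soft Q-learning entry above treats generation as RL with token-level Q-values. As a hedged sketch of what such an objective can look like, the snippet below reads an LM's logits as Q(s_t, .), defines the soft value V(s) = tau * logsumexp(Q/tau), and regresses the chosen-token Q-value onto a one-step soft Bellman target; the reward placement, gamma = 1, and single-step target are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def soft_value(q, tau=1.0):
    # V(s) = tau * logsumexp(Q(s, .) / tau)
    return tau * torch.logsumexp(q / tau, dim=-1)

torch.manual_seed(0)
seq_len, vocab_size, tau = 3, 8, 1.0

q_logits = torch.randn(seq_len, vocab_size, requires_grad=True)  # Q(s_t, .) read off the LM head
actions = torch.randint(vocab_size, (seq_len,))                  # tokens actually generated
rewards = torch.tensor([0.0, 0.0, 1.0])                          # e.g. a sequence metric paid at the end

q_taken = q_logits.gather(-1, actions.unsqueeze(-1)).squeeze(-1)
v_next = torch.cat([soft_value(q_logits[1:], tau), torch.zeros(1)])  # V = 0 after the final token
loss = F.mse_loss(q_taken, (rewards + v_next).detach())              # one-step soft Bellman regression
loss.backward()
print(loss.item())
```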
- Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z)
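For the OT matching described in the Student-Forcing OT entry above, the sketch below computes an entropy-regularized (Sinkhorn) transport cost between toy embeddings of a generated sequence and a reference sequence and treats it as a differentiable loss. The embeddings, cost normalization, and uniform marginals are assumptions; this is the generic Sinkhorn recipe, not the paper's exact student-forcing procedure.

```python
import torch

def sinkhorn_cost(cost, eps=0.1, iters=100):
    """Entropy-regularized OT between two uniform marginals for a pairwise
    cost matrix; returns the transport cost <P, C>."""
    m, n = cost.shape
    a = torch.full((m,), 1.0 / m)
    b = torch.full((n,), 1.0 / n)
    K = torch.exp(-cost / eps)
    u, v = torch.ones(m), torch.ones(n)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)
    return (plan * cost).sum()

torch.manual_seed(0)
generated = torch.randn(5, 16, requires_grad=True)  # toy embeddings of generated tokens
reference = torch.randn(6, 16)                      # toy embeddings of reference tokens

cost = torch.cdist(generated, reference)
cost = cost / cost.max()          # keep the cost scale compatible with eps
loss = sinkhorn_cost(cost, eps=0.1)
loss.backward()
print(loss.item())
```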
- ColdGANs: Taming Language GANs with Cautious Sampling Strategies [29.943949944682196]
Generative Adversarial Networks (GANs) can mitigate the limitations of maximum-likelihood training, but the discrete nature of text has hindered their application to language generation.
We show how classical sampling results in unstable training.
We propose to consider alternative exploration strategies in a GAN framework that we name ColdGANs, where we force the sampling to be close to the distribution modes to get smoother learning dynamics.
For the first time, to the best of our knowledge, the proposed language GANs compare favorably to MLE, and obtain improvements over the state-of-the-art on three generative tasks.
arXiv Detail & Related papers (2020-06-08T14:48:14Z)
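The ColdGANs entry above forces sampling toward the modes of the generator's distribution. The snippet below shows the simplest version of that intuition, "cold" (temperature < 1) softmax sampling, which concentrates probability mass on high-likelihood tokens; the toy logits and the plain temperature scaling are assumptions and do not reproduce the paper's full GAN training scheme.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.tensor([2.0, 1.0, 0.5, -1.0, -2.0])  # toy next-token logits

for temperature in (1.0, 0.5, 0.2):  # T < 1 means "colder", mode-seeking sampling
    probs = F.softmax(logits / temperature, dim=-1)
    samples = torch.multinomial(probs, num_samples=1000, replacement=True)
    top_share = (samples == logits.argmax()).float().mean()
    print(f"T={temperature}: p(top token)={probs.max().item():.2f}, "
          f"empirical share={top_share.item():.2f}")
```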
- QURIOUS: Question Generation Pretraining for Text Generation [13.595014409069584]
We propose question generation as a pretraining method, which better aligns with the text generation objectives.
Our text generation models pretrained with this method are better at understanding the essence of the input and are better language models for the target task.
arXiv Detail & Related papers (2020-04-23T08:41:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.