OptAGAN: Entropy-based finetuning on text VAE-GAN
- URL: http://arxiv.org/abs/2109.00239v1
- Date: Wed, 1 Sep 2021 08:23:19 GMT
- Title: OptAGAN: Entropy-based finetuning on text VAE-GAN
- Authors: Paolo Tirotta and Stefano Lodi
- Abstract summary: Recently Optimus, a variational autoencoder (VAE) that combines
two pre-trained models, BERT and GPT-2, has been released.
Its combination with GANs has been shown to produce novel, yet very human-looking text.
- Score: 1.941730292017383
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer learning through large pre-trained models has changed the landscape
of current applications in natural language processing (NLP). Recently Optimus,
a variational autoencoder (VAE) which combines two pre-trained models, BERT and
GPT-2, has been released, and its combination with generative adversarial
networks (GANs) has been shown to produce novel, yet very human-looking text.
The Optimus and GANs combination avoids the troublesome application of GANs to
the discrete domain of text, and prevents the exposure bias of standard maximum
likelihood methods. We combine the training of GANs in the latent space with
the finetuning of the decoder of Optimus for single-word generation. This
approach lets us model both the high-level features of the sentences, and the
low-level word-by-word generation. We finetune using reinforcement learning
(RL) by exploiting the structure of GPT-2 and by adding entropy-based
intrinsically motivated rewards to balance between quality and diversity. We
benchmark the results of the VAE-GAN model, and show the improvements brought
by our RL finetuning on three widely used datasets for text generation, with
results that greatly surpass the current state-of-the-art for the quality of
the generated texts.
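As a rough illustration of the RL finetuning described in the abstract, the following sketch (not the authors' code) shows a REINFORCE-style update in which a sampled sentence's reward is an external quality score plus an entropy-based intrinsic bonus, so the decoder is nudged toward diverse yet high-quality generations. The quality_score input, the beta weight, and the toy tensors standing in for GPT-2 decoder outputs are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def entropy_bonus(logits: torch.Tensor) -> torch.Tensor:
    """Mean per-token entropy of the decoder's predictive distribution."""
    log_probs = F.log_softmax(logits, dim=-1)        # (seq_len, vocab)
    probs = log_probs.exp()
    return -(probs * log_probs).sum(dim=-1).mean()

def reinforce_loss(logits: torch.Tensor,
                   sampled_ids: torch.Tensor,
                   quality_score: float,
                   beta: float = 0.1) -> torch.Tensor:
    """REINFORCE loss for one sampled sentence.

    logits:        (seq_len, vocab) decoder outputs at each generation step
    sampled_ids:   (seq_len,) tokens that were actually sampled
    quality_score: external quality reward (e.g. from a discriminator); assumed
    beta:          weight of the entropy-based intrinsic reward; assumed value
    """
    log_probs = F.log_softmax(logits, dim=-1)
    token_log_probs = log_probs.gather(1, sampled_ids.unsqueeze(1)).squeeze(1)
    reward = quality_score + beta * entropy_bonus(logits).detach()
    # Maximize the reward-weighted log-likelihood of the sampled sequence.
    return -(reward * token_log_probs.sum())

# Toy usage with random logits standing in for a GPT-2 decoder's outputs.
vocab_size, seq_len = 50257, 12
logits = torch.randn(seq_len, vocab_size, requires_grad=True)
sampled = torch.randint(0, vocab_size, (seq_len,))
loss = reinforce_loss(logits, sampled, quality_score=0.8)
loss.backward()
```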
Related papers
- A Combined Encoder and Transformer Approach for Coherent and High-Quality Text Generation [5.930799903736776]
This research introduces a novel text generation model that combines BERT's semantic interpretation strengths with GPT-4's generative capabilities.
The model enhances semantic depth and maintains smooth, human-like text flow, overcoming limitations seen in prior models.
arXiv Detail & Related papers (2024-11-19T01:41:56Z) - Refining Joint Text and Source Code Embeddings for Retrieval Task with Parameter-Efficient Fine-Tuning [0.0]
We propose a fine-tuning framework that leverages Parameter-Efficient Fine-Tuning (PEFT) techniques.
We demonstrate that the proposed fine-tuning framework has the potential to improve code-text retrieval performance by tuning only 0.4% parameters at most.
arXiv Detail & Related papers (2024-05-07T08:50:25Z) - LlaMaVAE: Guiding Large Language Model Generation via Continuous Latent
Sentence Spaces [1.529963465178546]
We present LlaMaVAE, which combines expressive encoder and decoder models (sentenceT5 and LlaMA) with a VAE architecture to provide better text generation control to large language models (LLMs)
Experimental results reveal that LlaMaVAE can outperform the previous state-of-the-art VAE language model, Optimus, across various tasks.
arXiv Detail & Related papers (2023-12-20T17:25:23Z) - BatGPT: A Bidirectional Autoregessive Talker from Generative Pre-trained
Transformer [77.28871523946418]
BatGPT is a large-scale language model designed and trained jointly by Wuhan University and Shanghai Jiao Tong University.
It is capable of generating highly natural and fluent text in response to various types of input, including text prompts, images, and audio.
arXiv Detail & Related papers (2023-07-01T15:10:01Z) - Scalable Learning of Latent Language Structure With Logical Offline
Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z) - Leveraging Pre-trained Models for Failure Analysis Triplets Generation [0.0]
We leverage the attention mechanism of pre-trained causal language models, such as the Transformer model, for the downstream task of generating Failure Analysis Triplets (FATs).
We observe that Generative Pre-trained Transformer 2 (GPT2) outperformed other transformer models for the failure analysis triplet generation (FATG) task.
In particular, we observe that GPT2 (with 1.5B parameters) outperforms pre-trained BERT, BART and GPT3 by a large margin on ROUGE.
arXiv Detail & Related papers (2022-10-31T17:21:15Z) - Factorized Neural Transducer for Efficient Language Model Adaptation [51.81097243306204]
We propose a novel model, factorized neural Transducer, by factorizing the blank and vocabulary prediction.
It is expected that this factorization can transfer the improvement of the standalone language model to the Transducer for speech recognition.
We demonstrate that the proposed factorized neural Transducer yields 15% to 20% WER improvements when out-of-domain text data is used for language model adaptation.
arXiv Detail & Related papers (2021-09-27T15:04:00Z) - Cycle-Consistent Inverse GAN for Text-to-Image Synthesis [101.97397967958722]
We propose a novel unified framework of Cycle-consistent Inverse GAN for both text-to-image generation and text-guided image manipulation tasks.
We learn a GAN inversion model to convert the images back to the GAN latent space and obtain the inverted latent codes for each image.
In the text-guided optimization module, we generate images with the desired semantic attributes by optimizing the inverted latent codes.
arXiv Detail & Related papers (2021-08-03T08:38:16Z) - Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z) - Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space [109.79957125584252]
Variational Autoencoder (VAE) can be both a powerful generative model and an effective representation learning framework for natural language.
In this paper, we propose the first large-scale language VAE model, Optimus.
arXiv Detail & Related papers (2020-04-05T06:20:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.