Efficient text generation of user-defined topic using generative
adversarial networks
- URL: http://arxiv.org/abs/2006.12005v1
- Date: Mon, 22 Jun 2020 04:49:47 GMT
- Title: Efficient text generation of user-defined topic using generative
adversarial networks
- Authors: Chenhan Yuan, Yi-chin Huang and Cheng-Hung Tsai
- Abstract summary: We propose a User-Defined GAN (UD-GAN) with two-level discriminators so that the whole network need not be re-trained whenever the user-defined topic changes.
The proposed method generates texts in less time than other GAN-based methods.
- Score: 0.32228025627337864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study focused on efficient text generation using generative adversarial
networks (GAN). Assuming that the goal is to generate a paragraph with a
user-defined topic and sentimental tendency, conventionally the whole network
has to be re-trained each time the user changes the topic, which is
time-consuming and impractical. Therefore, we propose a
User-Defined GAN (UD-GAN) with two-level discriminators to solve this problem.
The first discriminator, constructed from multiple LSTMs, guides the generator
to learn paragraph-level information and sentence syntactic structure. The
second handles higher-level information, such as the user-defined sentiment
and topic for text generation. Cosine similarity over TF-IDF vectors,
combined with a length penalty, is used to determine topic relevance. Only
the second discriminator is re-trained with the generator when the topic or
sentiment for text generation is modified. System evaluations compare the
performance of the proposed method with other GAN-based approaches. The
objective results show that the proposed method generates text in less time
than the other methods, and that the generated text is relevant to the
user-defined topic and sentiment. We will further investigate incorporating
more detailed paragraph information, such as semantics, into text generation
to enhance the results.
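The abstract describes the second discriminator's topic check only at a high level: cosine similarity over TF-IDF vectors, discounted by a length penalty. Below is a minimal sketch of such a relevance score; the function name `topic_relevance`, the exponential penalty form, the word-count threshold, and the use of scikit-learn are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of the topic-relevance score described in the abstract:
# TF-IDF cosine similarity between a generated text and the user-defined
# topic, discounted by a length penalty. The penalty form and all
# hyperparameters are assumptions, not taken from the paper.
import math

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def topic_relevance(generated: str, topic: str, corpus: list[str],
                    alpha: float = 0.1, min_words: int = 10) -> float:
    """Score in [0, 1]: higher means the text is closer to the topic."""
    # Fit TF-IDF on a reference corpus plus both texts so they share
    # one vocabulary and IDF weighting.
    vectorizer = TfidfVectorizer()
    vectorizer.fit(corpus + [generated, topic])
    vecs = vectorizer.transform([generated, topic])
    sim = cosine_similarity(vecs[0], vecs[1])[0, 0]
    # Assumed length penalty: exponentially discount very short outputs,
    # which can score high on similarity while carrying little content.
    penalty = math.exp(-alpha * max(0, min_words - len(generated.split())))
    return float(sim * penalty)

# Example: score a generated sentence against a user-defined topic.
score = topic_relevance(
    "the acting in this movie is quietly devastating",
    "movie reviews about acting",
    corpus=["reviews of films and actors", "opinions on cinema acting"],
)
```

Because only this relevance signal and the second discriminator depend on the user-defined topic and sentiment, changing them requires re-training just that discriminator together with the generator, which is where the reported time savings come from.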
Related papers
- QAEA-DR: A Unified Text Augmentation Framework for Dense Retrieval [12.225881591629815]
In dense retrieval, embedding long texts into dense vectors can result in information loss, leading to inaccurate query-text matching.
Recent studies mainly focus on improving the sentence embedding model or retrieval process.
We introduce a novel text augmentation framework for dense retrieval, which transforms raw documents into information-dense text formats.
arXiv Detail & Related papers (2024-07-29T17:39:08Z)
- RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z)
- Optimizing Factual Accuracy in Text Generation through Dynamic Knowledge Selection [71.20871905457174]
Language models (LMs) have revolutionized the way we interact with information, but they often generate nonfactual text.
Previous methods use external knowledge as references for text generation to enhance factuality, but they often struggle when irrelevant references are mixed into the knowledge.
We present DKGen, which divides the text generation process into an iterative process.
arXiv Detail & Related papers (2023-08-30T02:22:40Z)
- Sequentially Controlled Text Generation [97.22539956688443]
While GPT-2 generates sentences that are remarkably human-like, longer documents can ramble and do not follow human-like writing structure.
We study the problem of imposing structure on long-range text.
We develop a sequentially controlled text generation pipeline with generation and editing.
arXiv Detail & Related papers (2023-01-05T21:23:51Z)
- TegFormer: Topic-to-Essay Generation with Good Topic Coverage and High Text Coherence [8.422108048684215]
We propose a novel approach to topic-to-essay generation called TegFormer.
A Topic-Extension layer captures the interaction between the given topics and their domain-specific contexts.
An Embedding-Fusion module combines the domain-specific word embeddings learnt from the given corpus and the general-purpose word embeddings provided by a GPT-2 model pre-trained on massive text data.
arXiv Detail & Related papers (2022-12-27T11:50:14Z)
- A survey on text generation using generative adversarial networks [0.0]
This work presents a thorough review concerning recent studies and text generation advancements using Generative Adversarial Networks.
The usage of adversarial learning for text generation is promising as it provides alternatives to generate the so-called "natural" language.
arXiv Detail & Related papers (2022-12-20T17:54:08Z)
- RSTGen: Imbuing Fine-Grained Interpretable Control into Long-Form Text Generators [26.27412809287025]
RSTGen is a framework that controls the discourse structure, semantics and topics of generated text.
We demonstrate our model's ability to control structural discourse and semantic features of generated text in open generation evaluation.
arXiv Detail & Related papers (2022-05-25T09:06:04Z)
- Data-to-text Generation with Variational Sequential Planning [74.3955521225497]
We consider the task of data-to-text generation, which aims to create textual output from non-linguistic input.
We propose a neural model enhanced with a planning component responsible for organizing high-level information in a coherent and meaningful way.
We infer latent plans sequentially with a structured variational model, while interleaving the steps of planning and generation.
arXiv Detail & Related papers (2022-02-28T13:17:59Z)
- A Benchmark Corpus for the Detection of Automatically Generated Text in Academic Publications [0.02578242050187029]
This paper presents two datasets of artificially generated research content.
In the first case, the content is completely generated by the GPT-2 model after a short prompt extracted from original papers.
The partial or hybrid dataset is created by replacing several sentences of abstracts with sentences that are generated by the Arxiv-NLP model.
We evaluate the quality of the datasets comparing the generated texts to aligned original texts using fluency metrics such as BLEU and ROUGE.
arXiv Detail & Related papers (2022-02-04T08:16:56Z)
- Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z)
- Select, Extract and Generate: Neural Keyphrase Generation with Layer-wise Coverage Attention [75.44523978180317]
We propose SEG-Net, a neural keyphrase generation model that is composed of two major components.
The experimental results on seven keyphrase generation benchmarks from scientific and web documents demonstrate that SEG-Net outperforms the state-of-the-art neural generative methods by a large margin.
arXiv Detail & Related papers (2020-08-04T18:00:07Z)