Controlled Text Generation using T5 based Encoder-Decoder Soft Prompt Tuning and Analysis of the Utility of Generated Text in AI
- URL: http://arxiv.org/abs/2212.02924v1
- Date: Tue, 6 Dec 2022 12:31:53 GMT
- Title: Controlled Text Generation using T5 based Encoder-Decoder Soft Prompt Tuning and Analysis of the Utility of Generated Text in AI
- Authors: Damith Chamalke Senadeera, Julia Ive
- Abstract summary: We introduce a novel soft prompt tuning method that uses soft prompts at both the encoder and decoder levels of a T5 model.
We also investigate the feasibility of steering the output of this extended soft-prompted T5 model at the decoder level.
- Score: 2.381686610905853
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Controlled text generation is an important task in natural
language processing owing to its promising applications. To achieve it, we
introduce a novel soft prompt tuning method that uses soft prompts at both the
encoder and decoder levels of a T5 model, and we investigate its performance,
since the behaviour of an additional soft prompt attached to the decoder of a
T5 model in controlled text generation had remained unexplored. We then
investigate the feasibility of steering the output of this extended
soft-prompted T5 model at the decoder level. Finally, we analyse the utility of
the generated text for AI-related tasks such as training AI models, including
an interpretability analysis of a classifier trained on the synthetic text,
since methodologies for generating properly labelled data for AI tasks
currently lack thorough analysis. Through in-depth intrinsic and extrinsic
evaluations of this generation model and the artificially generated data, we
found that the model outperforms a T5 model with a single soft prompt at the
encoder level, that a sentiment classifier trained on the artificially
generated data achieves classification results comparable to those of a
classifier trained on real labelled data, and that the classifier's decisions
are interpretable with respect to the input text content.
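As a concrete illustration of the core idea, below is a minimal sketch of prepending learnable soft prompts to both the encoder and decoder inputs of a frozen T5 model. It assumes PyTorch and the HuggingFace transformers library; the class and variable names are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn
from transformers import T5ForConditionalGeneration

class EncDecSoftPromptT5(nn.Module):
    def __init__(self, model_name="t5-base", n_prompt_tokens=20):
        super().__init__()
        self.t5 = T5ForConditionalGeneration.from_pretrained(model_name)
        for p in self.t5.parameters():            # freeze the T5 backbone
            p.requires_grad = False
        d_model = self.t5.config.d_model
        # Two independent soft prompts: one for the encoder, one for the decoder.
        self.enc_prompt = nn.Parameter(torch.randn(n_prompt_tokens, d_model) * 0.5)
        self.dec_prompt = nn.Parameter(torch.randn(n_prompt_tokens, d_model) * 0.5)

    def forward(self, input_ids, attention_mask, labels):
        B = input_ids.size(0)
        embed = self.t5.get_input_embeddings()

        # Prepend the encoder prompt to the embedded source tokens.
        enc_embeds = torch.cat(
            [self.enc_prompt.unsqueeze(0).expand(B, -1, -1), embed(input_ids)], dim=1)
        enc_mask = torch.cat(
            [torch.ones(B, self.enc_prompt.size(0),
                        dtype=attention_mask.dtype, device=attention_mask.device),
             attention_mask], dim=1)

        # Prepend the decoder prompt to the embedded, right-shifted targets.
        dec_ids = self.t5._shift_right(labels)
        dec_embeds = torch.cat(
            [self.dec_prompt.unsqueeze(0).expand(B, -1, -1), embed(dec_ids)], dim=1)

        # Pad labels with -100 so the loss ignores the prompt positions.
        pad = torch.full((B, self.dec_prompt.size(0)), -100,
                         dtype=labels.dtype, device=labels.device)
        out = self.t5(inputs_embeds=enc_embeds, attention_mask=enc_mask,
                      decoder_inputs_embeds=dec_embeds,
                      labels=torch.cat([pad, labels], dim=1))
        return out.loss
```

Only the two prompt matrices are trainable here, so such a setup tunes a tiny fraction of the parameters while the T5 backbone stays frozen.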
Related papers
- Text2Data: Low-Resource Data Generation with Textual Control [104.38011760992637]
Natural language serves as a common and straightforward control signal for humans to interact seamlessly with machines.
We propose Text2Data, a novel approach that utilizes unlabeled data to understand the underlying data distribution through an unsupervised diffusion model.
It undergoes controllable finetuning via a novel constraint optimization-based learning objective that ensures controllability and effectively counteracts catastrophic forgetting.
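The sketch below shows one generic penalty-style form a constraint-optimization finetuning objective can take; it is an illustration of how a soft constraint can counteract forgetting, not Text2Data's actual objective.

```python
import torch

def constrained_finetune_loss(control_loss: torch.Tensor,
                              pretrain_loss: torch.Tensor,
                              pretrain_loss_ref: float,
                              lam: float = 1.0) -> torch.Tensor:
    """Minimize the controllability loss subject to a soft constraint that the
    unsupervised (distribution-matching) loss stays near its pre-finetuning
    reference value, which discourages catastrophic forgetting."""
    violation = torch.clamp(pretrain_loss - pretrain_loss_ref, min=0.0)
    return control_loss + lam * violation
```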
arXiv Detail & Related papers (2024-02-08T03:41:39Z)
- A Simple yet Efficient Ensemble Approach for AI-generated Text Detection [0.5840089113969194]
Large Language Models (LLMs) have demonstrated remarkable capabilities in generating text that closely resembles human writing.
It is essential to build automated approaches capable of distinguishing between artificially generated text and human-authored text.
We propose a simple yet efficient solution by ensembling predictions from multiple constituent LLMs.
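A minimal sketch of the ensembling idea follows; the detector functions are placeholders standing in for the constituent LLM-based classifiers, not the paper's actual models.

```python
from typing import Callable, List

def ensemble_ai_text_score(text: str,
                           detectors: List[Callable[[str], float]]) -> float:
    """Each detector maps text -> P(AI-generated); the ensemble averages them."""
    scores = [detect(text) for detect in detectors]
    return sum(scores) / len(scores)

# Usage with dummy detectors standing in for per-LLM classifiers:
detectors = [lambda t: 0.8, lambda t: 0.6, lambda t: 0.7]
print(ensemble_ai_text_score("sample text", detectors))  # ~0.7
```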
arXiv Detail & Related papers (2023-11-06T13:11:02Z)
- Exploring Automatic Evaluation Methods based on a Decoder-based LLM for Text Generation [16.78350863261211]
This paper compares various methods, including tuning with encoder-based models and large language models under equal conditions.
Experimental results show that compared to the tuned encoder-based models, the tuned decoder-based models perform poorly.
It is also revealed that in-context learning of very large decoder-based models such as ChatGPT makes it difficult to identify fine-grained semantic differences.
arXiv Detail & Related papers (2023-10-17T06:53:00Z)
- Optimizing Factual Accuracy in Text Generation through Dynamic Knowledge Selection [71.20871905457174]
Language models (LMs) have revolutionized the way we interact with information, but they often generate nonfactual text.
Previous methods use external knowledge as references to improve factuality, but they often struggle when irrelevant references get mixed into the generation.
We present DKGen, which divides text generation into an iterative process.
arXiv Detail & Related papers (2023-08-30T02:22:40Z)
- Code-Switching Text Generation and Injection in Mandarin-English ASR [57.57570417273262]
We investigate text generation and injection to improve the performance of Transformer-Transducer (T-T), a streaming model commonly used in industry.
We first propose a strategy to generate code-switching text data and then investigate injecting generated text into T-T model explicitly by Text-To-Speech (TTS) conversion or implicitly by tying speech and text latent spaces.
Experimental results on the T-T model trained with a dataset containing 1,800 hours of real Mandarin-English code-switched speech show that our approaches to inject generated code-switching text significantly boost the performance of T-T models.
arXiv Detail & Related papers (2023-03-20T09:13:27Z)
- On the Reliability and Explainability of Language Models for Program Generation [15.569926313298337]
We study the capabilities and limitations of automated program generation approaches.
We employ advanced explainable AI approaches to highlight the tokens that significantly contribute to the code transformation.
Our analysis reveals that, in various experimental scenarios, language models can recognize code grammar and structural information, but they exhibit limited robustness to changes in input sequences.
arXiv Detail & Related papers (2023-02-19T14:59:52Z)
- Attention Is Indeed All You Need: Semantically Attention-Guided Decoding for Data-to-Text NLG [0.913755431537592]
We propose a novel decoding method that extracts interpretable information from encoder-decoder models' cross-attention.
We show on three datasets its ability to dramatically reduce semantic errors in the generated outputs.
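The paper's exact rescoring procedure is not reproduced here; the sketch below only shows how cross-attention can be surfaced during generation with the HuggingFace transformers library (an assumed toolkit) so that a decoding method can inspect which source tokens the model attended to at each step.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tok("translate English to German: The house is wonderful.",
             return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20,
                     output_attentions=True, return_dict_in_generate=True)

# out.cross_attentions holds, per generated step, per-layer tensors of shape
# (batch, heads, 1, source_len). Averaging over layers and heads gives one
# attention distribution over source tokens for every generated token.
step_dists = [torch.stack(step).mean(dim=(0, 2)).squeeze(1)
              for step in out.cross_attentions]

src_tokens = tok.convert_ids_to_tokens(inputs.input_ids[0])
for t, dist in enumerate(step_dists):
    print(t, src_tokens[dist[0].argmax().item()])  # most-attended source token
```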
arXiv Detail & Related papers (2021-09-15T01:42:51Z)
- Do Encoder Representations of Generative Dialogue Models Encode Sufficient Information about the Task? [41.36218215755317]
We show that evaluating generated text with human or automatic metrics is not sufficient to appropriately assess the soundness of a dialogue model's language understanding.
We propose a set of probe tasks to evaluate encoder representation of different language encoders commonly used in dialogue models.
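A generic probing-classifier sketch follows, illustrating the setup rather than the paper's specific probe tasks: freeze the dialogue encoder and train only a small linear head on its pooled representations. Names are illustrative, and `hidden_dim` is assumed to match the encoder's hidden size.

```python
import torch
import torch.nn as nn

class LinearProbe(nn.Module):
    def __init__(self, encoder, hidden_dim, n_classes):
        super().__init__()
        self.encoder = encoder.eval()          # frozen representation source
        for p in self.encoder.parameters():
            p.requires_grad = False
        self.head = nn.Linear(hidden_dim, n_classes)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():                  # encoder never updates
            h = self.encoder(input_ids,
                             attention_mask=attention_mask).last_hidden_state
        pooled = h.mean(dim=1)                 # simple mean pooling
        return self.head(pooled)               # only the probe head trains
```

If the probe classifies the task well from frozen representations, the information is plausibly encoded; if not, the encoder likely never captured it.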
arXiv Detail & Related papers (2021-06-20T04:52:37Z)
- Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
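To make the matching idea concrete, here is a generic entropic-OT (Sinkhorn) distance between two token-embedding sequences, such as one from teacher forcing and one from the model's own free-running generations. This is a standard Sinkhorn implementation used as an illustration, not the paper's exact objective.

```python
import torch

def sinkhorn_ot_loss(x, y, eps=0.1, n_iters=50):
    """x: (m, d) teacher-forced embeddings; y: (n, d) generated embeddings."""
    cost = torch.cdist(x, y, p=2)                 # pairwise transport costs
    m, n = cost.shape
    a = torch.full((m,), 1.0 / m)                 # uniform source weights
    b = torch.full((n,), 1.0 / n)                 # uniform target weights
    K = torch.exp(-cost / eps)                    # Gibbs kernel
    u = torch.ones(m)
    for _ in range(n_iters):                      # Sinkhorn scaling iterations
        v = b / (K.t() @ u)
        u = a / (K @ v)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)    # transport plan
    return (plan * cost).sum()                    # <plan, cost> = OT distance

loss = sinkhorn_ot_loss(torch.randn(12, 64), torch.randn(10, 64))
```

Minimizing such a distance pulls the distribution of generated states toward the teacher-forced ones, which is the intuition behind matching training and testing modes.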
arXiv Detail & Related papers (2020-10-12T19:42:25Z)
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
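A toy sketch of the coarse-to-fine insertion loop follows: start from the hard lexical constraints and repeatedly insert tokens between existing ones until nothing more is proposed. The `propose_token` callable is a hypothetical placeholder for POINTER's learned insertion model.

```python
from typing import Callable, List, Optional

def progressive_generate(constraints: List[str],
                         propose_token: Callable[[str, str], Optional[str]],
                         max_rounds: int = 10) -> List[str]:
    seq = list(constraints)                    # coarse stage: keywords only
    for _ in range(max_rounds):
        inserts = [propose_token(seq[i], seq[i + 1])
                   for i in range(len(seq) - 1)]
        if all(tok is None for tok in inserts):
            return seq                         # converged: nothing to insert
        new_seq = []
        for i, tok in enumerate(inserts):      # parallel insertion per gap
            new_seq.append(seq[i])
            if tok is not None:
                new_seq.append(tok)
        new_seq.append(seq[-1])
        seq = new_seq
    return seq

# Dummy proposer that fills each gap at most once:
proposals = {("sky", "blue"): "is"}
dummy = lambda left, right: proposals.pop((left, right), None)
print(progressive_generate(["sky", "blue"], dummy))  # ['sky', 'is', 'blue']
```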
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
With a strong auto-regressive decoder, VAEs tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)