Cloning Ideology and Style using Deep Learning
- URL: http://arxiv.org/abs/2211.07712v1
- Date: Tue, 25 Oct 2022 11:37:19 GMT
- Title: Cloning Ideology and Style using Deep Learning
- Authors: Dr. Omer Beg, Muhammad Nasir Zafar, Waleed Anjum
- Abstract summary: Research focuses on text generation based on the ideology and style of a specific author, and text generation on a topic that was not written by the same author in the past.
The Bi-LSTM model makes predictions at the character level; during training, the corpus of a specific author is used along with a ground-truth corpus.
A pre-trained model identifies sentences in the ground truth that contradict the author's corpus, biasing the language model toward the author.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text generation tasks have attracted the attention of researchers in the last
few years because of their large-scale applications. In the past, many
researchers focused on task-based text generation. Our research focuses on text
generation based on the ideology and style of a specific author, and on text
generation about topics that the same author has not written about in the past. Our
trained model takes an input prompt containing the first few words of text and
produces a few paragraphs of text in the ideology and style of the author
on which the model is trained. Our methodology is based on a Bi-LSTM model
that makes predictions at the character level; during training, the corpus of a
specific author is used along with a ground-truth corpus. A pre-trained model
identifies the sentences of the ground truth that contradict the author's corpus,
biasing our language model toward the author. During training, we achieved a
perplexity score of 2.23 at the character level. The experiments show a
perplexity score of around 3 on the test dataset.
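The paper does not publish code, but the evaluation metric it reports is standard: character-level perplexity is the exponential of the mean negative log-likelihood the model assigns to each ground-truth character. A minimal sketch (the helper name and the per-character probability value are illustrative, not from the paper):

```python
import math

def char_perplexity(probs):
    """Perplexity from the probabilities a character-level language model
    assigns to each ground-truth character in a held-out sequence.

    probs: list of P(true char | preceding context), one per character.
    Perplexity = exp(mean negative log-likelihood).
    """
    nll = -sum(math.log(p) for p in probs) / len(probs)
    return math.exp(nll)

# For intuition: a model that assigned probability ~0.4484 to every true
# character would score a perplexity of ~2.23, the character-level training
# figure reported in the abstract.
print(round(char_perplexity([0.4484] * 100), 2))  # → 2.23
```

Under this metric, the reported scores mean the model is, on average, about as uncertain as a uniform choice among 2.23 characters during training and about 3 characters on the test set.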
Related papers
- Detecting Mode Collapse in Language Models via Narration [0.0]
We study 4,374 stories sampled from three OpenAI language models.
We show that successive versions of GPT-3 suffer from increasing degrees of "mode collapse".
Our method and results are significant for researchers seeking to employ language models in sociological simulations.
arXiv Detail & Related papers (2024-02-06T23:52:58Z) - CiteBench: A benchmark for Scientific Citation Text Generation [69.37571393032026]
CiteBench is a benchmark for citation text generation.
We make the code for CiteBench publicly available at https://github.com/UKPLab/citebench.
arXiv Detail & Related papers (2022-12-19T16:10:56Z) - MOCHA: A Multi-Task Training Approach for Coherent Text Generation from Cognitive Perspective [22.69509556890676]
We propose a novel multi-task training strategy for coherent text generation grounded on the cognitive theory of writing.
We extensively evaluate our model on three open-ended generation tasks including story generation, news article writing and argument generation.
arXiv Detail & Related papers (2022-10-26T11:55:41Z) - Unsupervised Neural Stylistic Text Generation using Transfer learning and Adapters [66.17039929803933]
We propose a novel transfer learning framework which updates only 0.3% of model parameters to learn style-specific attributes for response generation.
We learn style specific attributes from the PERSONALITY-CAPTIONS dataset.
arXiv Detail & Related papers (2022-10-07T00:09:22Z) - PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, registry, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model fit to learn authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z) - How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN [63.79300884115027]
Current language models can generate high-quality text.
Are they simply copying text they have seen before, or have they learned generalizable linguistic abstractions?
We introduce RAVEN, a suite of analyses for assessing the novelty of generated text.
arXiv Detail & Related papers (2021-11-18T04:07:09Z) - Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as an informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks.
This study presents an assessment of existing language models in distinguishing the sentiment expressed in tweets, using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z) - Quasi Error-free Text Classification and Authorship Recognition in a large Corpus of English Literature based on a Novel Feature Set [0.0]
We show that in the entire GLEC quasi error-free text classification and authorship recognition is possible with a method using the same set of five style and five content features.
Our data pave the way for many future computational and empirical studies of literature or experiments in reading psychology.
arXiv Detail & Related papers (2020-10-21T07:39:55Z) - Exemplar-Controllable Paraphrasing and Translation using Bitext [57.92051459102902]
We adapt models from prior work to be able to learn solely from bilingual text (bitext).
Our single proposed model can perform four tasks: controlled paraphrase generation in both languages and controlled machine translation in both language directions.
arXiv Detail & Related papers (2020-10-12T17:02:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.