Text Detoxification using Large Pre-trained Neural Models
- URL: http://arxiv.org/abs/2109.08914v1
- Date: Sat, 18 Sep 2021 11:55:32 GMT
- Title: Text Detoxification using Large Pre-trained Neural Models
- Authors: David Dale, Anton Voronov, Daryna Dementieva, Varvara Logacheva, Olga Kozlova, Nikita Semenov and Alexander Panchenko
- Abstract summary: We present two novel unsupervised methods for eliminating toxicity in text.
The first method guides the generation process of a paraphrasing model with small style-conditional language models.
The second method uses BERT to replace toxic words with non-offensive synonyms.
- Score: 57.72086777177844
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present two novel unsupervised methods for eliminating toxicity in text.
Our first method combines two recent ideas: (1) guidance of the generation
process with small style-conditional language models and (2) use of
paraphrasing models to perform style transfer. We use a well-performing
paraphraser guided by style-trained language models to keep the text content
and remove toxicity. Our second method uses BERT to replace toxic words with
their non-offensive synonyms. We make the method more flexible by enabling BERT
to replace mask tokens with a variable number of words. Finally, we present the
first large-scale comparative study of style transfer models on the task of
toxicity removal. We compare our models with a number of methods for style
transfer. The models are evaluated in a reference-free way using a combination
of unsupervised style transfer metrics. Both methods we suggest yield new SOTA
results.
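To make the second idea concrete, the following is a minimal sketch of masking flagged toxic words and letting a masked language model propose in-context replacements, using the Hugging Face transformers library. The toxic-word lexicon, model choice, and candidate-selection heuristic are illustrative assumptions, and the sketch fills each mask with a single word rather than the variable-length spans described above; it is not the authors' exact pipeline.

```python
# Minimal sketch (not the paper's implementation): mask words flagged as toxic
# and let a masked language model suggest in-context replacements.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

TOXIC_WORDS = {"stupid", "idiot"}  # hypothetical placeholder lexicon


def detoxify(sentence: str) -> str:
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if tok.lower().strip(".,!?") in TOXIC_WORDS:
            # Replace the offending word with BERT's mask token and re-predict it.
            masked = tokens[:i] + [fill_mask.tokenizer.mask_token] + tokens[i + 1:]
            candidates = fill_mask(" ".join(masked))
            # Naive selection: take the top-scoring candidate that is not itself toxic.
            for cand in candidates:
                if cand["token_str"].strip().lower() not in TOXIC_WORDS:
                    tokens[i] = cand["token_str"].strip()
                    break
    return " ".join(tokens)


print(detoxify("you are a stupid person"))
```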
Related papers
- Unsupervised Text Style Transfer via LLMs and Attention Masking with Multi-way Interactions [18.64326057581588]
Unsupervised Text Style Transfer (UTST) has emerged as a critical task within the domain of Natural Language Processing (NLP).
We propose four ways of interaction, including a pipeline framework with tuned orders, knowledge distillation from Large Language Models (LLMs) to an attention masking model, and in-context learning with constructed parallel examples.
We empirically show that these multi-way interactions can improve the baselines in terms of style strength, content preservation, and text fluency.
arXiv Detail & Related papers (2024-02-21T09:28:02Z)
- Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation [82.5217996570387]
We adapt a pre-trained language model for auto-regressive text-to-image generation.
We find that pre-trained language models offer limited help.
arXiv Detail & Related papers (2023-11-27T07:19:26Z)
- Prefix-Tuning Based Unsupervised Text Style Transfer [29.86587278794342]
Unsupervised text style transfer aims at training a generative model that can alter the style of the input sentence while preserving its content.
In this paper, we employ powerful pre-trained large language models and present a new prefix-tuning-based method for unsupervised text style transfer (a minimal prefix-tuning sketch follows this entry).
arXiv Detail & Related papers (2023-10-23T06:13:08Z)
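As an illustration of the prefix-tuning idea above, here is a minimal sketch of wrapping a frozen seq2seq backbone with trainable prefix vectors using the Hugging Face peft library; the backbone name and hyperparameters are illustrative assumptions, not the cited paper's setup.

```python
# Minimal prefix-tuning sketch (illustrative, not the cited paper's method):
# only a small set of virtual prefix tokens is trained; the backbone stays frozen.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PrefixTuningConfig, TaskType, get_peft_model

model_name = "t5-small"  # hypothetical backbone choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

peft_config = PrefixTuningConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    num_virtual_tokens=20,  # length of the trainable prefix
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the prefix parameters are trainable
```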
- Shifted Diffusion for Text-to-image Generation [65.53758187995744]
Corgi is based on our proposed shifted diffusion model, which achieves better image embedding generation from input text.
Corgi also achieves new state-of-the-art results across different datasets on downstream language-free text-to-image generation tasks.
arXiv Detail & Related papers (2022-11-24T03:25:04Z)
- Replacing Language Model for Style Transfer [6.364517234783756]
We introduce the replacing language model (RLM), a sequence-to-sequence language modeling framework for text style transfer (TST).
Our method autoregressively replaces each token of the source sentence with a text span that has a similar meaning but is in the target style.
The new span is generated via a non-autoregressive masked language model, which can better preserve the local-contextual meaning of the replaced token.
arXiv Detail & Related papers (2022-11-14T13:35:55Z)
- Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning [53.92465205531759]
Controlled automated story generation seeks to generate natural language stories satisfying constraints from natural language critiques or preferences.
We train a contrastive bi-encoder model to align stories with human critiques, building a general purpose preference model.
We further fine-tune the contrastive reward model using a prompt-learning technique to increase story generation robustness.
arXiv Detail & Related papers (2022-10-14T13:21:33Z)
- Collocation2Text: Controllable Text Generation from Guide Phrases in Russian [0.0]
Collocation2Text is a plug-and-play method for automatic controllable text generation in Russian.
The method is based on two interacting models: the autoregressive ruGPT-3 language model and the autoencoding ruRoBERTa language model.
Experiments on generating news articles with the proposed method showed its effectiveness in automatically producing fluent texts.
arXiv Detail & Related papers (2022-06-18T17:10:08Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- Second-Order Unsupervised Neural Dependency Parsing [52.331561380948564]
Most unsupervised dependency parsers are based on first-order probabilistic generative models that only consider local parent-child information.
Inspired by second-order supervised dependency parsing, we propose a second-order extension of unsupervised neural dependency models that incorporates grandparent-child or sibling information.
Our joint model achieves a 10% improvement over the previous state-of-the-art on the full WSJ test set.
arXiv Detail & Related papers (2020-10-28T03:01:33Z)
- Abstractive Text Summarization based on Language Model Conditioning and Locality Modeling [4.525267347429154]
We condition a Transformer-based neural model on the BERT language model.
In addition, we propose a new method of BERT-windowing, which allows chunk-wise processing of texts longer than the BERT window size (a minimal chunking sketch follows this entry).
The results of our models are compared to a baseline and the state-of-the-art models on the CNN/Daily Mail dataset.
arXiv Detail & Related papers (2020-03-29T14:00:17Z)
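To make the chunk-wise idea concrete, below is a minimal sketch of encoding a document longer than BERT's 512-token limit by splitting it into overlapping windows via the Hugging Face tokenizer's overflow mechanism; the window size, stride, and pooling step are illustrative assumptions, not the paper's exact BERT-windowing scheme.

```python
# Minimal sketch of chunk-wise encoding of long texts with a BERT-style model
# (illustrative overlapping windows, not the paper's exact BERT-windowing).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")


def encode_long_text(text: str) -> torch.Tensor:
    # Split the token stream into overlapping 512-token windows.
    enc = tokenizer(
        text,
        max_length=512,
        stride=128,                      # overlap between consecutive windows
        truncation=True,
        return_overflowing_tokens=True,  # keep every window, not just the first
        padding=True,
        return_tensors="pt",
    )
    enc.pop("overflow_to_sample_mapping", None)  # bookkeeping field, not a model input
    with torch.no_grad():
        out = model(**enc)
    # Mean-pool the per-window [CLS] vectors into one document representation.
    return out.last_hidden_state[:, 0, :].mean(dim=0)


doc_vector = encode_long_text("a fairly long document ... " * 400)
print(doc_vector.shape)  # torch.Size([768])
```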