Text Detoxification using Large Pre-trained Neural Models
- URL: http://arxiv.org/abs/2109.08914v1
- Date: Sat, 18 Sep 2021 11:55:32 GMT
- Title: Text Detoxification using Large Pre-trained Neural Models
- Authors: David Dale, Anton Voronov, Daryna Dementieva, Varvara Logacheva, Olga Kozlova, Nikita Semenov and Alexander Panchenko
- Abstract summary: We present two novel unsupervised methods for eliminating toxicity in text.
The first method guides the generation process of a paraphrasing model with small style-conditional language models.
The second method uses BERT to replace toxic words with non-offensive synonyms.
- Score: 57.72086777177844
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present two novel unsupervised methods for eliminating toxicity in text.
Our first method combines two recent ideas: (1) guidance of the generation
process with small style-conditional language models and (2) use of
paraphrasing models to perform style transfer. We use a well-performing
paraphraser guided by style-trained language models to keep the text content
and remove toxicity. Our second method uses BERT to replace toxic words with
their non-offensive synonyms. We make the method more flexible by enabling BERT
to replace mask tokens with a variable number of words. Finally, we present the
first large-scale comparative study of style transfer models on the task of
toxicity removal. We compare our models with a number of methods for style
transfer. The models are evaluated in a reference-free way using a combination
of unsupervised style transfer metrics. Both methods we suggest yield new SOTA
results.
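To make the second idea concrete, the following is a minimal sketch of masking flagged toxic words and letting a masked language model propose in-context replacements, using the Hugging Face transformers library. The toxic-word lexicon, model choice, and candidate-selection heuristic are illustrative assumptions, and the sketch fills each mask with a single word rather than the variable-length spans described above; it is not the authors' exact pipeline.

```python
# Minimal sketch (not the paper's implementation): mask words flagged as toxic
# and let a masked language model suggest in-context replacements.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

TOXIC_WORDS = {"stupid", "idiot"}  # hypothetical placeholder lexicon


def detoxify(sentence: str) -> str:
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if tok.lower().strip(".,!?") in TOXIC_WORDS:
            # Replace the offending word with BERT's mask token and re-predict it.
            masked = tokens[:i] + [fill_mask.tokenizer.mask_token] + tokens[i + 1:]
            candidates = fill_mask(" ".join(masked))
            # Naive selection: take the top-scoring candidate that is not itself toxic.
            for cand in candidates:
                if cand["token_str"].strip().lower() not in TOXIC_WORDS:
                    tokens[i] = cand["token_str"].strip()
                    break
    return " ".join(tokens)


print(detoxify("you are a stupid person"))
```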
Related papers
- Unsupervised Text Style Transfer via LLMs and Attention Masking with Multi-way Interactions [18.64326057581588]
Unsupervised Text Style Transfer (UTST) has emerged as a critical task within the domain of Natural Language Processing (NLP).
We propose four ways of interaction, including a pipeline framework with tuned orders, knowledge distillation from Large Language Models (LLMs) to an attention masking model, and in-context learning with constructed parallel examples.
We empirically show that these multi-way interactions can improve the baselines in terms of style strength, content preservation, and text fluency.
arXiv Detail & Related papers (2024-02-21T09:28:02Z)
- Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation [82.5217996570387]
We adapt a pre-trained language model for auto-regressive text-to-image generation.
We find that pre-trained language models offer limited help.
arXiv Detail & Related papers (2023-11-27T07:19:26Z)
- Prefix-Tuning Based Unsupervised Text Style Transfer [29.86587278794342]
Unsupervised text style transfer aims at training a generative model that can alter the style of the input sentence while preserving its content.
In this paper, we employ powerful pre-trained large language models and present a new prefix-tuning-based method for unsupervised text style transfer (a minimal prefix-tuning sketch follows this entry).
arXiv Detail & Related papers (2023-10-23T06:13:08Z)
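As an illustration of the prefix-tuning idea above, here is a minimal sketch of wrapping a frozen seq2seq backbone with trainable prefix vectors using the Hugging Face peft library; the backbone name and hyperparameters are illustrative assumptions, not the cited paper's setup.

```python
# Minimal prefix-tuning sketch (illustrative, not the cited paper's method):
# only a small set of virtual prefix tokens is trained; the backbone stays frozen.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PrefixTuningConfig, TaskType, get_peft_model

model_name = "t5-small"  # hypothetical backbone choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

peft_config = PrefixTuningConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    num_virtual_tokens=20,  # length of the trainable prefix
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the prefix parameters are trainable
```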
- Shifted Diffusion for Text-to-image Generation [65.53758187995744]
Corgi is based on our proposed shifted diffusion model, which achieves better image embedding generation from input text.
Corgi also achieves new state-of-the-art results across different datasets on downstream language-free text-to-image generation tasks.
arXiv Detail & Related papers (2022-11-24T03:25:04Z)
- Replacing Language Model for Style Transfer [6.364517234783756]
We introduce the replacing language model (RLM), a sequence-to-sequence language modeling framework for text style transfer (TST).
Our method autoregressively replaces each token of the source sentence with a text span that has a similar meaning but is in the target style.
The new span is generated via a non-autoregressive masked language model, which can better preserve the local-contextual meaning of the replaced token.
arXiv Detail & Related papers (2022-11-14T13:35:55Z)
- Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning [53.92465205531759]
Controlled automated story generation seeks to generate natural language stories satisfying constraints from natural language critiques or preferences.
We train a contrastive bi-encoder model to align stories with human critiques, building a general purpose preference model.
We further fine-tune the contrastive reward model using a prompt-learning technique to increase story generation robustness.
arXiv Detail & Related papers (2022-10-14T13:21:33Z)
- Collocation2Text: Controllable Text Generation from Guide Phrases in Russian [0.0]
Collocation2Text is a plug-and-play method for automatic controllable text generation in Russian.
The method is based on two interacting models: the autoregressive ruGPT-3 language model and the autoencoding ruRoBERTa language model.
Experiments on generating news articles with the proposed method showed its effectiveness in automatically producing fluent texts.
arXiv Detail & Related papers (2022-06-18T17:10:08Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- Second-Order Unsupervised Neural Dependency Parsing [52.331561380948564]
Most unsupervised dependency parsers are based on first-order probabilistic generative models that only consider local parent-child information.
Inspired by second-order supervised dependency parsing, we propose a second-order extension of unsupervised neural dependency models that incorporates grandparent-child or sibling information.
Our joint model achieves a 10% improvement over the previous state-of-the-art on the full WSJ test set.
arXiv Detail & Related papers (2020-10-28T03:01:33Z)
- Abstractive Text Summarization based on Language Model Conditioning and Locality Modeling [4.525267347429154]
We condition a Transformer-based neural model on the BERT language model.
In addition, we propose a new method of BERT-windowing, which allows chunk-wise processing of texts longer than the BERT window size (a minimal chunking sketch follows this entry).
The results of our models are compared to a baseline and the state-of-the-art models on the CNN/Daily Mail dataset.
arXiv Detail & Related papers (2020-03-29T14:00:17Z)
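To make the chunk-wise idea concrete, below is a minimal sketch of encoding a document longer than BERT's 512-token limit by splitting it into overlapping windows via the Hugging Face tokenizer's overflow mechanism; the window size, stride, and pooling step are illustrative assumptions, not the paper's exact BERT-windowing scheme.

```python
# Minimal sketch of chunk-wise encoding of long texts with a BERT-style model
# (illustrative overlapping windows, not the paper's exact BERT-windowing).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")


def encode_long_text(text: str) -> torch.Tensor:
    # Split the token stream into overlapping 512-token windows.
    enc = tokenizer(
        text,
        max_length=512,
        stride=128,                      # overlap between consecutive windows
        truncation=True,
        return_overflowing_tokens=True,  # keep every window, not just the first
        padding=True,
        return_tensors="pt",
    )
    enc.pop("overflow_to_sample_mapping", None)  # bookkeeping field, not a model input
    with torch.no_grad():
        out = model(**enc)
    # Mean-pool the per-window [CLS] vectors into one document representation.
    return out.last_hidden_state[:, 0, :].mean(dim=0)


doc_vector = encode_long_text("a fairly long document ... " * 400)
print(doc_vector.shape)  # torch.Size([768])
```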