Towards Robust and Semantically Organised Latent Representations for
Unsupervised Text Style Transfer
- URL: http://arxiv.org/abs/2205.02309v1
- Date: Wed, 4 May 2022 20:04:24 GMT
- Title: Towards Robust and Semantically Organised Latent Representations for
Unsupervised Text Style Transfer
- Authors: Sharan Narasimhan, Suvodip Dey, Maunendra Sankar Desarkar
- Abstract summary: We introduce EPAAEs (versading Perturbed Adrial AutoEncoders) which completes this perturbation model.
We empirically show that this (a) produces a better organised latent space that clusters stylistically similar sentences together.
We also extend the text style transfer tasks to NLI datasets and show that these more complex definitions of style are learned best by EPAAE.
- Score: 6.467090475885798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies show that auto-encoder based approaches successfully perform
language generation, smooth sentence interpolation, and style transfer over
unseen attributes using unlabelled datasets in a zero-shot manner. The latent
space geometry of such models is organised well enough to perform on datasets
where the style is "coarse-grained" i.e. a small fraction of words alone in a
sentence are enough to determine the overall style label. A recent study uses a
discrete token-based perturbation approach to map "similar" sentences
("similar" defined by low Levenshtein distance/ high word overlap) close by in
latent space. This definition of "similarity" does not consider the underlying
nuances of the constituent words when mapping latent-space neighbourhoods, and
therefore fails to recognise sentences with different style-based semantics. We
introduce EPAAEs (Embedding Perturbed Adversarial AutoEncoders), which complete
this perturbation model by adding a finely adjustable noise component in the
continuous embedding space. We
empirically show that this (a) produces a better organised latent space that
clusters stylistically similar sentences together, (b) performs better on a
diverse set of text style transfer tasks than comparable denoising-inspired
baselines, and (c) is capable of fine-grained control of style transfer
strength. We also extend the text style transfer tasks to NLI datasets and show
that these more complex definitions of style are learned best by EPAAE. To the
best of our knowledge, extending style transfer to NLI tasks has not been
explored before.
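The abstract's core idea, adding a finely adjustable noise component on the continuous embedding space, can be illustrated with a minimal sketch. The function name, the use of isotropic Gaussian noise, and the toy dimensions are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def perturb_embeddings(embeddings: np.ndarray, noise_scale: float,
                       rng: np.random.Generator) -> np.ndarray:
    """Add isotropic Gaussian noise to a (seq_len, dim) matrix of token
    embeddings. noise_scale is the finely adjustable knob: larger values
    let stylistically similar sentences drift further apart in latent space."""
    noise = rng.normal(loc=0.0, scale=noise_scale, size=embeddings.shape)
    return embeddings + noise

# Toy usage: a 3-token sentence with 4-dimensional embeddings.
rng = np.random.default_rng(0)
sentence = rng.normal(size=(3, 4))
perturbed = perturb_embeddings(sentence, noise_scale=0.1, rng=rng)
```

Unlike discrete token-level perturbations (drop/swap edits measured by Levenshtein distance), this noise acts on the continuous embeddings, so its strength can be tuned smoothly.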
Related papers
- ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style
Transfer [57.6482608202409]
Textual style transfer is the task of transforming stylistic properties of text while preserving meaning.
We introduce a novel diffusion-based framework for general-purpose style transfer that can be flexibly adapted to arbitrary target styles.
We validate the method on the Enron Email Corpus, with both human and automatic evaluations, and find that it outperforms strong baselines on formality, sentiment, and even authorship style transfer.
arXiv Detail & Related papers (2023-08-29T17:36:02Z) - Representation Of Lexical Stylistic Features In Language Models'
Embedding Space [28.60690854046176]
We show that it is possible to derive a vector representation for each of these stylistic notions from only a small number of seed pairs.
We conduct experiments on five datasets and find that static embeddings encode these features more accurately than contextualized representations at the level of words and phrases.
The lower performance of contextualized representations at the word level is partially attributable to the anisotropy of their vector space.
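The anisotropy mentioned above can be quantified with a standard diagnostic: the average pairwise cosine similarity of the embedding set, which is near zero for an isotropic space and approaches one when all vectors share a dominant direction. This is a generic sketch, not the paper's evaluation code:

```python
import numpy as np

def mean_pairwise_cosine(vectors: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of row vectors.
    Values well above zero indicate an anisotropic (cone-shaped) space."""
    normed = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = normed @ normed.T
    n = len(vectors)
    # Exclude the self-similarities on the diagonal (each equal to 1).
    return float((sims.sum() - n) / (n * (n - 1)))

rng = np.random.default_rng(0)
isotropic = rng.normal(size=(100, 32))  # roughly isotropic Gaussian cloud
shifted = isotropic + 5.0               # shared offset makes it anisotropic
```

With the shared offset, every vector points in roughly the same direction, so `mean_pairwise_cosine(shifted)` is close to 1 while the centred cloud stays near 0.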
arXiv Detail & Related papers (2023-05-29T23:44:26Z) - Sequential Integrated Gradients: a simple but effective method for
explaining language models [0.18459705687628122]
We propose a new method for explaining language models called Sequential Integrated Gradients (SIG).
SIG computes the importance of each word in a sentence by keeping every other word fixed, creating interpolations only between the baseline and the word of interest.
We show on various models and datasets that SIG proves to be a very effective method for explaining language models.
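The interpolation scheme SIG describes, varying only one word's embedding along the path from the baseline while all other words stay fixed, can be sketched as follows. The function name, the finite-difference gradients (standing in for autograd), and the toy linear scoring model are all assumptions for illustration:

```python
import numpy as np

def sig_attribution(embeddings, baseline, score_fn, word_idx, steps=20):
    """SIG-style attribution for one word: interpolate only that word's
    embedding between the baseline and its true value, keep every other
    word fixed, and average the score's gradient along the path."""
    eps = 1e-5
    avg_grad = np.zeros_like(embeddings[word_idx])
    for step in range(1, steps + 1):
        alpha = step / steps
        point = embeddings.copy()
        point[word_idx] = baseline[word_idx] + alpha * (
            embeddings[word_idx] - baseline[word_idx])
        # Finite-difference gradient w.r.t. the interpolated word only.
        for d in range(point.shape[1]):
            bumped = point.copy()
            bumped[word_idx, d] += eps
            avg_grad[d] += (score_fn(bumped) - score_fn(point)) / eps
    avg_grad /= steps
    # Attribution = (input - baseline) * path-averaged gradient.
    return (embeddings[word_idx] - baseline[word_idx]) * avg_grad

# Toy linear "model": the sentence score is the sum of each token
# embedding's dot product with a fixed weight vector.
w = np.array([1.0, -2.0, 0.5])
score = lambda E: float((E @ w).sum())

sentence = np.array([[1.0, 1.0, 1.0], [2.0, 0.0, 1.0]])
baseline = np.zeros_like(sentence)
attribution = sig_attribution(sentence, baseline, score, word_idx=0)
```

For a linear model the path integral is exact, so the attribution reduces to the elementwise product of the word's embedding (minus the baseline) and the weight vector.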
arXiv Detail & Related papers (2023-05-25T08:44:11Z) - StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized
Tokenizer of a Large-Scale Generative Model [64.26721402514957]
We propose StylerDALLE, a style transfer method that uses natural language to describe abstract art styles.
Specifically, we formulate the language-guided style transfer task as a non-autoregressive token sequence translation.
To incorporate style information, we propose a Reinforcement Learning strategy with CLIP-based language supervision.
arXiv Detail & Related papers (2023-03-16T12:44:44Z) - Zero-Shot Cross-Lingual Dependency Parsing through Contextual Embedding
Transformation [7.615096161060399]
Cross-lingual embedding space mapping is usually studied in static word-level embeddings.
We investigate a contextual embedding alignment approach which is sense-level and dictionary-free.
Experiments on zero-shot dependency parsing through the concept-shared space built by our embedding transformation substantially outperform state-of-the-art methods using multilingual embeddings.
arXiv Detail & Related papers (2021-03-03T06:50:43Z) - GTAE: Graph-Transformer based Auto-Encoders for Linguistic-Constrained
Text Style Transfer [119.70961704127157]
Non-parallel text style transfer has attracted increasing research interest in recent years.
Current approaches still lack the ability to preserve the content and even logic of original sentences.
We propose Graph Transformer based Auto-Encoders (GTAE), which models a sentence as a linguistic graph and performs feature extraction and style transfer at the graph level.
arXiv Detail & Related papers (2021-02-01T11:08:45Z) - A Comparative Study on Structural and Semantic Properties of Sentence
Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z) - Exploring Contextual Word-level Style Relevance for Unsupervised Style
Transfer [60.07283363509065]
Unsupervised style transfer aims to change the style of an input sentence while preserving its original content.
We propose a novel attentional sequence-to-sequence model that exploits the relevance of each output word to the target style.
Experimental results show that our proposed model achieves state-of-the-art performance in terms of both transfer accuracy and content preservation.
arXiv Detail & Related papers (2020-05-05T10:24:28Z) - Contextual Text Style Transfer [73.66285813595616]
Contextual Text Style Transfer aims to translate a sentence into a desired style with its surrounding context taken into account.
We propose a Context-Aware Style Transfer (CAST) model, which uses two separate encoders for each input sentence and its surrounding context.
Two new benchmarks, Enron-Context and Reddit-Context, are introduced for formality and offensiveness style transfer.
arXiv Detail & Related papers (2020-04-30T23:01:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content and is not responsible for any consequences of its use.