Style Pooling: Automatic Text Style Obfuscation for Improved
Classification Fairness
- URL: http://arxiv.org/abs/2109.04624v1
- Date: Fri, 10 Sep 2021 02:17:21 GMT
- Title: Style Pooling: Automatic Text Style Obfuscation for Improved
Classification Fairness
- Authors: Fatemehsadat Mireshghallah, Taylor Berg-Kirkpatrick
- Abstract summary: The style of writing in job applications might reveal protected attributes of the candidate, which could lead to bias in hiring decisions.
We propose a VAE-based framework that obfuscates stylistic features of human-generated text through style transfer by automatically re-writing the text itself.
- Score: 32.3545569050269
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text style can reveal sensitive attributes of the author (e.g. race or age)
to the reader, which can, in turn, lead to privacy violations and bias in both
human and algorithmic decisions based on text. For example, the style of
writing in job applications might reveal protected attributes of the candidate,
which could lead to bias in hiring decisions, regardless of whether hiring
decisions are made algorithmically or by humans. We propose a VAE-based
framework that obfuscates stylistic features of human-generated text through
style transfer by automatically re-writing the text itself. Our framework
operationalizes the notion of obfuscated style in a flexible way that enables
two distinct notions of obfuscated style: (1) a minimal notion that effectively
intersects the various styles seen in training, and (2) a maximal notion that
seeks to obfuscate by adding stylistic features of all sensitive attributes to
text, in effect, computing a union of styles. Our style-obfuscation framework
can be used for multiple purposes; however, we demonstrate its effectiveness in
improving the fairness of downstream classifiers. We also conduct a
comprehensive study on style pooling's effect on fluency, semantic consistency,
and attribute removal from text, in two- and three-domain style obfuscation.
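As a toy illustration of the two notions of obfuscated style described in the abstract, the sketch below treats each style as a set of marker words and computes their intersection (the minimal notion) and their union (the maximal notion). This is only an analogy under loud assumptions: the paper's framework is VAE-based and rewrites text through style transfer, not through word sets, and the names `style_markers`, `minimal_style`, and `maximal_style`, as well as the example words, are hypothetical.

```python
from functools import reduce

# Hypothetical marker words for three sensitive-attribute domains.
# These are illustrative placeholders, not data from the paper.
style_markers = {
    "domain_a": {"gonna", "really", "thanks"},
    "domain_b": {"shall", "really", "thanks"},
    "domain_c": {"lol", "really", "thanks"},
}


def minimal_style(markers):
    """Minimal notion: keep only the features shared by every domain (intersection)."""
    return reduce(set.intersection, markers.values())


def maximal_style(markers):
    """Maximal notion: pool the features of all domains (union)."""
    return reduce(set.union, markers.values())


minimal = minimal_style(style_markers)
maximal = maximal_style(style_markers)

print("intersection of styles:", sorted(minimal))
print("union of styles:", sorted(maximal))
```

In this analogy, text written in the minimal style carries no feature unique to one domain, while text written in the maximal style carries features of every domain, so neither points to a single sensitive attribute.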
Related papers
- RealCustom: Narrowing Real Text Word for Real-Time Open-Domain
Text-to-Image Customization [57.86083349873154]
Text-to-image customization aims to synthesize text-driven images for the given subjects.
Existing works follow the pseudo-word paradigm, i.e., represent the given subjects as pseudo-words and then compose them with the given text.
We present RealCustom that, for the first time, disentangles similarity from controllability by precisely limiting subject influence to relevant parts only.
arXiv Detail & Related papers (2024-03-01T12:12:09Z)
- Learning to Generate Text in Arbitrary Writing Styles [6.7308816341849695]
It is desirable for language models to produce text in an author-specific style on the basis of a potentially small writing sample.
We propose to guide a language model to generate text in a target style using contrastively-trained representations that capture stylometric features.
arXiv Detail & Related papers (2023-12-28T18:58:52Z)
- ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style
Transfer [57.6482608202409]
Textual style transfer is the task of transforming stylistic properties of text while preserving meaning.
We introduce a novel diffusion-based framework for general-purpose style transfer that can be flexibly adapted to arbitrary target styles.
We validate the method on the Enron Email Corpus, with both human and automatic evaluations, and find that it outperforms strong baselines on formality, sentiment, and even authorship style transfer.
arXiv Detail & Related papers (2023-08-29T17:36:02Z)
- MSSRNet: Manipulating Sequential Style Representation for Unsupervised
Text Style Transfer [82.37710853235535]
The unsupervised text style transfer task aims to rewrite a text into a target style while preserving its main content.
Traditional methods rely on a fixed-size vector to regulate text style, which makes it difficult to accurately convey the style strength of each individual token.
Our proposed method addresses this issue by assigning an individual style vector to each token in a text, allowing for fine-grained control and manipulation of the style strength.
arXiv Detail & Related papers (2023-06-12T13:12:29Z)
- Stylized Data-to-Text Generation: A Case Study in the E-Commerce Domain [53.22419717434372]
We propose a new task, namely stylized data-to-text generation, whose aim is to generate coherent text according to a specific style.
This task is non-trivial, due to three challenges: the logic of the generated text, unstructured style reference, and biased training samples.
We propose a novel stylized data-to-text generation model, named StyleD2T, comprising three components: logic planning-enhanced data embedding, mask-based style embedding, and unbiased stylized text generation.
arXiv Detail & Related papers (2023-05-05T03:02:41Z)
- Conversation Style Transfer using Few-Shot Learning [56.43383396058639]
In this paper, we introduce conversation style transfer as a few-shot learning problem.
We propose a novel in-context learning approach to solve the task with style-free dialogues as a pivot.
We show that conversation style transfer can also benefit downstream tasks.
arXiv Detail & Related papers (2023-02-16T15:27:00Z)
- SLOGAN: Handwriting Style Synthesis for Arbitrary-Length and
Out-of-Vocabulary Text [35.83345711291558]
We propose a novel method that can synthesize parameterized and controllable handwriting Styles for arbitrary-Length and Out-of-vocabulary text.
We embed the text content by providing an easily obtainable printed style image, so that the diversity of the content can be flexibly achieved.
Our method can synthesize words that are not included in the training vocabulary and with various new styles.
arXiv Detail & Related papers (2022-02-23T12:13:27Z)
- From Theories on Styles to their Transfer in Text: Bridging the Gap with
a Hierarchical Survey [10.822011920177408]
Style transfer aims at re-writing existing texts and creating paraphrases that exhibit desired stylistic attributes.
A handful of surveys give a methodological overview of the field, but they do not help researchers focus on specific styles.
We organize the styles into a hierarchy, highlighting the challenges in defining each of them and pointing out gaps in the current research landscape.
arXiv Detail & Related papers (2021-10-29T15:53:06Z)
- Protecting Anonymous Speech: A Generative Adversarial Network
Methodology for Removing Stylistic Indicators in Text [2.9005223064604078]
We develop a new approach to authorship anonymization by constructing a generative adversarial network.
Our fully automatic method achieves comparable results to other methods in terms of content preservation and fluency.
Our approach is able to generalize well to an open-set context and anonymize sentences from authors it has not encountered before.
arXiv Detail & Related papers (2021-10-18T17:45:56Z)
- Separating Content from Style Using Adversarial Learning for Recognizing
Text in the Wild [103.51604161298512]
We propose an adversarial learning framework for the generation and recognition of multiple characters in an image.
Our framework can be integrated into recent recognition methods to achieve new state-of-the-art recognition accuracy.
arXiv Detail & Related papers (2020-01-13T12:41:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.