TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods
- URL: http://arxiv.org/abs/2407.21630v1
- Date: Wed, 31 Jul 2024 14:24:01 GMT
- Title: TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods
- Authors: Gabriel Loiseau, Damien Sileo, Damien Riquet, Maxime Meyer, Marc Tommasi
- Abstract summary: Authorship obfuscation aims to disguise the identity of an author within a text.
This alteration needs to balance privacy and utility.
We propose TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization.
- Score: 5.239989658197324
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Authorship obfuscation aims to disguise the identity of an author within a text by altering the writing style, vocabulary, syntax, and other linguistic features associated with the text author. This alteration needs to balance privacy and utility. While strong obfuscation techniques can effectively hide the author's identity, they often degrade the quality and usefulness of the text for its intended purpose. Conversely, maintaining high utility tends to provide insufficient privacy, making it easier for an adversary to de-anonymize the author. Thus, achieving an optimal trade-off between these two conflicting objectives is crucial. In this paper, we propose TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization, a new unsupervised authorship obfuscation method whose goal is to optimize the privacy-utility trade-off by regenerating the entire text considering its downstream utility. Our approach leverages policy optimization as a fine-tuning paradigm over small language models in order to rewrite texts while protecting author identity and preserving downstream task utility. We show that our approach largely reduces the accuracy of attackers while preserving utility. We make our code and models publicly available.
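The privacy-utility trade-off described in the abstract can be illustrated with a small scalar reward of the kind a policy optimization loop might maximize. This is only a sketch under stated assumptions: the abstract does not specify the reward, so the function names (`cosine`, `tradeoff_reward`), the use of precomputed style and content embedding vectors, and the `privacy_weight` parameter are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def tradeoff_reward(orig_style, new_style, orig_content, new_content,
                    privacy_weight=0.5):
    """Hypothetical reward mixing a privacy term and a utility term.

    Assumption: style vectors capture authorship cues and content vectors
    capture task-relevant meaning; both are supplied by the caller.
    """
    # Privacy is high when the rewrite's style moves away from the original.
    privacy = 1.0 - cosine(orig_style, new_style)
    # Utility is high when the rewrite's content stays close to the original.
    utility = cosine(orig_content, new_content)
    return privacy_weight * privacy + (1.0 - privacy_weight) * utility


# A rewrite that keeps the content but changes the style scores higher
# than an unchanged copy of the text.
unchanged = tradeoff_reward([1.0, 0.0], [1.0, 0.0], [3.0, 4.0], [3.0, 4.0])
restyled = tradeoff_reward([1.0, 0.0], [0.0, 1.0], [3.0, 4.0], [3.0, 4.0])
print(unchanged, restyled)
```

Raising `privacy_weight` in this sketch favors stronger style changes at the expense of content fidelity, which mirrors the tension the abstract describes.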
Related papers
- NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human [55.20137833039499]
We suggest sanitizing sensitive text using two common strategies used by humans.
We curate the first corpus, coined NAP2, through both crowdsourcing and the use of large language models.
arXiv Detail & Related papers (2024-06-06T05:07:44Z)
- Keep It Private: Unsupervised Privatization of Online Text [13.381890596224867]
We introduce an automatic text privatization framework that fine-tunes a large language model via reinforcement learning to produce rewrites that balance soundness, sense, and privacy.
We evaluate it extensively on a large-scale test set of English Reddit posts by 68k authors composed of short-medium length texts.
arXiv Detail & Related papers (2024-05-16T17:12:18Z)
- DeepEraser: Deep Iterative Context Mining for Generic Text Eraser [103.39279154750172]
DeepEraser is a recurrent architecture that erases the text in an image via iterative operations.
DeepEraser is notably compact with only 1.4M parameters and trained in an end-to-end manner.
arXiv Detail & Related papers (2024-02-29T12:39:04Z)
- JAMDEC: Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language Models [53.83273575102087]
We propose an unsupervised inference-time approach to authorship obfuscation.
We introduce JAMDEC, a user-controlled, inference-time algorithm for authorship obfuscation.
Our approach builds on small language models such as GPT2-XL in order to help avoid disclosing the original content to proprietary LLM APIs.
arXiv Detail & Related papers (2024-02-13T19:54:29Z)
- Semantics-Preserved Distortion for Personal Privacy Protection in Information Management [65.08939490413037]
This paper suggests a linguistically-grounded approach to distort texts while maintaining semantic integrity.
We present two distinct frameworks for semantic-preserving distortion: a generative approach and a substitutive approach.
We also explore privacy protection in a specific medical information management scenario, showing our method effectively limits sensitive data memorization.
arXiv Detail & Related papers (2022-01-04T04:01:05Z)
- Protecting Anonymous Speech: A Generative Adversarial Network Methodology for Removing Stylistic Indicators in Text [2.9005223064604078]
We develop a new approach to authorship anonymization by constructing a generative adversarial network.
Our fully automatic method achieves comparable results to other methods in terms of content preservation and fluency.
Our approach is able to generalize well to an open-set context and anonymize sentences from authors it has not encountered before.
arXiv Detail & Related papers (2021-10-18T17:45:56Z)
- Style Pooling: Automatic Text Style Obfuscation for Improved Classification Fairness [32.3545569050269]
Style of writing in job applications might reveal protected attributes of the candidate which could lead to bias in hiring decisions.
We propose a VAE-based framework that obfuscates stylistic features of human-generated text through style transfer by automatically re-writing the text itself.
arXiv Detail & Related papers (2021-09-10T02:17:21Z)
- Semantic-Preserving Adversarial Text Attacks [85.32186121859321]
We propose a Bigram and Unigram based adaptive Semantic Preservation Optimization (BU-SPO) method to examine the vulnerability of deep models.
Our method achieves the highest attack success rates and semantics rates by changing the smallest number of words compared with existing methods.
arXiv Detail & Related papers (2021-08-23T09:05:18Z)
- TextHide: Tackling Data Privacy in Language Understanding Tasks [54.11691303032022]
TextHide mitigates privacy risks without slowing down training or reducing accuracy.
It requires all participants to add a simple encryption step to prevent an eavesdropping attacker from recovering private text data.
We evaluate TextHide on the GLUE benchmark, and our experiments show that TextHide can effectively defend attacks on shared gradients or representations.
arXiv Detail & Related papers (2020-10-12T22:22:15Z)
- A Girl Has A Name: Detecting Authorship Obfuscation [12.461503242570643]
Authorship attribution aims to identify the author of a text based on stylometric analysis.
Authorship obfuscation aims to protect against authorship attribution by modifying a text's style.
We evaluate the stealthiness of state-of-the-art authorship obfuscation methods under an adversarial threat model.
arXiv Detail & Related papers (2020-05-02T04:52:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.