Related papers: Defending Against Authorship Identification Attacks

Defending Against Authorship Identification Attacks

URL: http://arxiv.org/abs/2310.01568v1
Date: Mon, 2 Oct 2023 19:03:11 GMT
Title: Defending Against Authorship Identification Attacks
Authors: Haining Wang
Abstract summary: Authorship identification has proven unsettlingly effective in inferring the identity of the author of an unsigned document. The presented work offers a comprehensive review of the advancements in this research area spanning over the past two decades and beyond.
Score: 9.148691357200216
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Authorship identification has proven unsettlingly effective in inferring the identity of the author of an unsigned document, even when sensitive personal information has been carefully omitted. In the digital era, individuals leave a lasting digital footprint through their written content, whether it is posted on social media, stored on their employer's computers, or located elsewhere. When individuals need to communicate publicly yet wish to remain anonymous, there is little available to protect them from unwanted authorship identification. This unprecedented threat to privacy is evident in scenarios such as whistle-blowing. Proposed defenses against authorship identification attacks primarily aim to obfuscate one's writing style, thereby making it unlinkable to their pre-existing writing, while concurrently preserving the original meaning and grammatical integrity. The presented work offers a comprehensive review of the advancements in this research area spanning over the past two decades and beyond. It emphasizes the methodological frameworks of modification and generation-based strategies devised to evade authorship identification attacks, highlighting joint efforts from the differential privacy community. Limitations of current research are discussed, with a spotlight on open challenges and potential research avenues.

Related papers

Privacy-Preserving Biometric Verification with Handwritten Random Digit String [49.77172854374479]
Handwriting verification has stood as a steadfast identity authentication method for decades. However, this technique risks potential privacy breaches due to the inclusion of personal information in handwritten biometrics such as signatures. We propose using the Random Digit String (RDS) for privacy-preserving handwriting verification.
arXiv Detail & Related papers (2025-03-17T03:47:25Z)
RedactBuster: Entity Type Recognition from Redacted Documents [13.172863061928899]
We propose RedactBuster, the first deanonymization model using sentence context to perform Named Entity Recognition on reacted text. We test RedactBuster against the most effective redaction technique and evaluate it using the publicly available Text Anonymization Benchmark (TAB) Our results show accuracy values up to 0.985 regardless of the document nature or entity type.
arXiv Detail & Related papers (2024-04-19T16:42:44Z)
Privacy-preserving Optics for Enhancing Protection in Face De-identification [60.110274007388135]
We propose a hardware-level face de-identification method to solve this vulnerability. We also propose an anonymization framework that generates a new face using the privacy-preserving image, face heatmap, and a reference face image from a public dataset as input.
arXiv Detail & Related papers (2024-03-31T19:28:04Z)
JAMDEC: Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language Models [53.83273575102087]
We propose an unsupervised inference-time approach to authorship obfuscation. We introduce JAMDEC, a user-controlled, inference-time algorithm for authorship obfuscation. Our approach builds on small language models such as GPT2-XL in order to help avoid disclosing the original content to proprietary LLM's APIs.
arXiv Detail & Related papers (2024-02-13T19:54:29Z)
Disentangle Before Anonymize: A Two-stage Framework for Attribute-preserved and Occlusion-robust De-identification [55.741525129613535]
"Disentangle Before Anonymize" is a novel two-stage Framework(DBAF) This framework includes a Contrastive Identity Disentanglement (CID) module and a Key-authorized Reversible Identity Anonymization (KRIA) module. Extensive experiments demonstrate that our method outperforms state-of-the-art de-identification approaches.
arXiv Detail & Related papers (2023-11-15T08:59:02Z)
Diff-Privacy: Diffusion-based Face Privacy Protection [58.1021066224765]
In this paper, we propose a novel face privacy protection method based on diffusion models, dubbed Diff-Privacy. Specifically, we train our proposed multi-scale image inversion module (MSI) to obtain a set of SDM format conditional embeddings of the original image. Based on the conditional embeddings, we design corresponding embedding scheduling strategies and construct different energy functions during the denoising process to achieve anonymization and visual identity information hiding.
arXiv Detail & Related papers (2023-09-11T09:26:07Z)
User-Centered Security in Natural Language Processing [0.7106986689736825]
dissertation proposes a framework of user-centered security in Natural Language Processing (NLP) It focuses on two security domains within NLP with great public interest.
arXiv Detail & Related papers (2023-01-10T22:34:19Z)
Unsupervised Text Deidentification [101.2219634341714]
We propose an unsupervised deidentification method that masks words that leak personally-identifying information. Motivated by K-anonymity based privacy, we generate redactions that ensure a minimum reidentification rank.
arXiv Detail & Related papers (2022-10-20T18:54:39Z)
Statistical anonymity: Quantifying reidentification risks without reidentifying users [4.103598036312231]
Data anonymization is an approach to privacy-preserving data release aimed at preventing participants reidentification. Existing algorithms for enforcing $k$-anonymity in the released data assume that the curator performing the anonymization has complete access to the original data. This paper explores ideas for reducing the trust that must be placed in the curator, while still maintaining a statistical notion of $k$-anonymity.
arXiv Detail & Related papers (2022-01-28T18:12:44Z)
Protecting Anonymous Speech: A Generative Adversarial Network Methodology for Removing Stylistic Indicators in Text [2.9005223064604078]
We develop a new approach to authorship anonymization by constructing a generative adversarial network. Our fully automatic method achieves comparable results to other methods in terms of content preservation and fluency. Our approach is able to generalize well to an open-set context and anonymize sentences from authors it has not encountered before.
arXiv Detail & Related papers (2021-10-18T17:45:56Z)
No Intruder, no Validity: Evaluation Criteria for Privacy-Preserving Text Anonymization [0.48733623015338234]
We argue that researchers and practitioners developing automated text anonymization systems should carefully assess whether their evaluation methods truly reflect the system's ability to protect individuals from being re-identified. We propose TILD, a set of evaluation criteria that comprises an anonymization method's technical performance, the information loss resulting from its anonymization, and the human ability to de-anonymize redacted documents.
arXiv Detail & Related papers (2021-03-16T18:18:29Z)
Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News [57.9843300852526]
We introduce the more realistic and challenging task of defending against machine-generated news that also includes images and captions. To identify the possible weaknesses that adversaries can exploit, we create a NeuralNews dataset composed of 4 different types of generated articles. In addition to the valuable insights gleaned from our user study experiments, we provide a relatively effective approach based on detecting visual-semantic inconsistencies.
arXiv Detail & Related papers (2020-09-16T14:13:15Z)
Towards Face Encryption by Generating Adversarial Identity Masks [53.82211571716117]
We propose a targeted identity-protection iterative method (TIP-IM) to generate adversarial identity masks. TIP-IM provides 95%+ protection success rate against various state-of-the-art face recognition models.
arXiv Detail & Related papers (2020-03-15T12:45:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.