Reproduction and Replication of an Adversarial Stylometry Experiment
- URL: http://arxiv.org/abs/2208.07395v1
- Date: Mon, 15 Aug 2022 18:24:00 GMT
- Title: Reproduction and Replication of an Adversarial Stylometry Experiment
- Authors: Haining Wang, Patrick Juola, Allen Riddell
- Abstract summary: This paper reproduces and replicates experiments in a seminal study of defenses against authorship attribution.
We find new evidence suggesting that an entirely automatic method, round-trip translation, merits re-examination.
- Score: 8.374836126235499
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Maintaining anonymity while communicating using natural language remains a
challenge. Standard authorship attribution techniques that analyze candidate
authors' writing styles achieve uncomfortably high accuracy even when the
number of candidate authors is high. Adversarial stylometry defends against
authorship attribution with the goal of preventing unwanted deanonymization.
This paper reproduces and replicates experiments in a seminal study of defenses
against authorship attribution (Brennan et al., 2012). We are able to
successfully reproduce and replicate the original results, although we conclude
that the effectiveness of the defenses studied is overstated due to a lack of a
control group in the original study. In our replication, we find new evidence
suggesting that an entirely automatic method, round-trip translation, merits
re-examination as it appears to reduce the effectiveness of established
authorship attribution methods.
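Round-trip translation obfuscates style by translating a text into one or more pivot languages and then back to the original language. A minimal sketch of that pipeline, with a stand-in `translate` function (the paper does not specify this interface; a real deployment would call an MT model or service here):

```python
def translate(text: str, src: str, tgt: str) -> str:
    """Stand-in for a real machine translation call.
    It only tags the text so the round trip is visible."""
    return f"[{src}->{tgt}] {text}"

def round_trip(text: str, pivots: list[str]) -> str:
    """Translate through a chain of pivot languages and back to English.
    Each translation step tends to normalize away stylistic markers."""
    current, lang = text, "en"
    for pivot in pivots:
        current = translate(current, lang, pivot)
        lang = pivot
    return translate(current, lang, "en")

obfuscated = round_trip("The quick brown fox jumps over the lazy dog.", ["de", "ja"])
print(obfuscated)
```

The choice and number of pivot languages is a free parameter; the sketch only shows the chaining, not any particular configuration evaluated in the study.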
Related papers
- CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation [132.00910067533982]
We introduce CopyBench, a benchmark designed to measure both literal and non-literal copying in LM generations.
We find that, although literal copying is relatively rare, two types of non-literal copying -- event copying and character copying -- occur even in models as small as 7B parameters.
arXiv Detail & Related papers (2024-07-09T17:58:18Z)
- Keep It Private: Unsupervised Privatization of Online Text [13.381890596224867]
We introduce an automatic text privatization framework that fine-tunes a large language model via reinforcement learning to produce rewrites that balance soundness, sense, and privacy.
We evaluate it extensively on a large-scale test set of short- to medium-length English Reddit posts by 68k authors.
arXiv Detail & Related papers (2024-05-16T17:12:18Z)
- ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models [56.08917291606421]
ResearchAgent is a large language model-powered research idea writing agent.
It generates problems, methods, and experiment designs while iteratively refining them based on scientific literature.
We experimentally validate our ResearchAgent on scientific publications across multiple disciplines.
arXiv Detail & Related papers (2024-04-11T13:36:29Z)
- BERT-Enhanced Retrieval Tool for Homework Plagiarism Detection System [0.0]
We propose a plagiarized text data generation method based on GPT-3.5, which produces 32,927 pairs of text plagiarism detection datasets.
We also propose a plagiarism identification method based on Faiss with BERT with high efficiency and high accuracy.
Our experiments show that this model outperforms others on several metrics, achieving 98.86% Accuracy, 98.90% Precision, 98.86% Recall, and an F1 Score of 0.9888.
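The core retrieval step in such a system is nearest-neighbor search over sentence embeddings. A schematic numpy version of that step (the toy vectors below stand in for BERT embeddings, and in the paper's system Faiss would replace this brute-force search for scale):

```python
import numpy as np

# Toy sentence embeddings standing in for BERT vectors.
corpus = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.1],
    [0.0, 0.2, 0.9],
], dtype=np.float32)
query = np.array([0.8, 0.2, 0.1], dtype=np.float32)

def top_k_cosine(query, corpus, k=2):
    """Return indices of the k corpus vectors most similar to the
    query under cosine similarity, plus the full score vector."""
    corpus_n = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    scores = corpus_n @ query_n
    return np.argsort(-scores)[:k], scores

idx, scores = top_k_cosine(query, corpus)
print(idx[0])  # index of the closest candidate document
```

The retrieved candidates would then be compared against the query text to decide whether it is plagiarized.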
arXiv Detail & Related papers (2024-04-01T12:20:34Z)
- Forging the Forger: An Attempt to Improve Authorship Verification via Data Augmentation [52.72682366640554]
Authorship Verification (AV) is a text classification task concerned with inferring whether a candidate text has been written by one specific author or by someone else.
It has been shown that many AV systems are vulnerable to adversarial attacks, where a malicious author actively tries to fool the classifier by either concealing their writing style, or by imitating the style of another author.
arXiv Detail & Related papers (2024-03-17T16:36:26Z)
- Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks.
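The selection step described above amounts to ranking candidates by classifier confidence and poisoning the least confident ones. A minimal illustration (variable names are illustrative, not taken from the paper):

```python
import numpy as np

def select_poison_candidates(confidences: np.ndarray, budget: int) -> np.ndarray:
    """Pick the `budget` samples the model is least confident about;
    the paper argues poisoning these is harder for defenders to detect."""
    return np.argsort(confidences)[:budget]

# Hypothetical per-sample confidence scores from a trained classifier.
confidences = np.array([0.99, 0.42, 0.87, 0.15, 0.63])
poison_idx = select_poison_candidates(confidences, budget=2)
print(sorted(poison_idx.tolist()))
```

The actual attack additionally inserts triggers into the selected samples; only the sampling criterion is sketched here.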
arXiv Detail & Related papers (2023-10-08T18:57:36Z)
- Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases insignificant changes in input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Learning with Rejection for Abstractive Text Summarization [42.15551472507393]
We propose a training objective for abstractive summarization based on rejection learning.
We show that our method considerably improves the factuality of generated summaries in automatic and human evaluations.
arXiv Detail & Related papers (2023-02-16T19:07:08Z)
- May the Force Be with Your Copy Mechanism: Enhanced Supervised-Copy Method for Natural Language Generation [1.2453219864236247]
We propose a novel supervised approach of a copy network that helps the model decide which words need to be copied and which need to be generated.
Specifically, we re-define the objective function, which leverages source sequences and target vocabularies as guidance for copying.
The experimental results on data-to-text generation and abstractive summarization tasks verify that our approach enhances the copying quality and improves the degree of abstractness.
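Copy mechanisms of this kind are commonly realized as a soft gate that mixes a generation distribution with a copy distribution over source tokens. A schematic numpy version of that mixing (the gate value and distributions below are made up for illustration; the paper's contribution lies in how the gate is supervised, which is not shown here):

```python
import numpy as np

vocab = ["the", "cat", "sat", "<unk>"]
p_vocab = np.array([0.5, 0.2, 0.2, 0.1])  # generator's softmax distribution
p_copy  = np.array([0.0, 0.8, 0.2, 0.0])  # attention mass over source tokens
p_gen = 0.3                               # gate: probability of generating

# Final distribution is a convex combination of generating and copying.
p_final = p_gen * p_vocab + (1 - p_gen) * p_copy
print(vocab[int(np.argmax(p_final))])
```

Because the gate weights a convex combination, `p_final` remains a valid probability distribution.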
arXiv Detail & Related papers (2021-12-20T06:54:28Z)
- Tortured phrases: A dubious writing style emerging in science. Evidence of critical issues affecting established journals [69.76097138157816]
Probabilistic text generators have been used to produce fake scientific papers for more than a decade.
Complex AI-powered generation techniques produce texts indistinguishable from those written by humans.
Some websites offer to rewrite texts for free, generating gobbledegook full of tortured phrases.
arXiv Detail & Related papers (2021-07-12T20:47:08Z)
- Adversarial Stylometry in the Wild: Transferable Lexical Substitution Attacks on Author Profiling [13.722693312120462]
Adversarial stylometry attacks author profiling models by rewriting an author's text.
Our research proposes several components to facilitate deployment of these adversarial attacks in the wild.
arXiv Detail & Related papers (2021-01-27T10:42:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.