On the Vulnerability of Text Sanitization
- URL: http://arxiv.org/abs/2410.17052v1
- Date: Tue, 22 Oct 2024 14:31:53 GMT
- Title: On the Vulnerability of Text Sanitization
- Authors: Meng Tong, Kejiang Chen, Xiaojian Yuang, Jiayang Liu, Weiming Zhang, Nenghai Yu, Jie Zhang,
- Abstract summary: We propose theoretically optimal reconstruction attacks targeting text sanitization.
We derive their bounds on ASR as benchmarks for evaluating sanitization performance.
One of our attacks achieves a 46.4% improvement in ASR over the state-of-the-art baseline.
- Score: 60.162007426724564
- License:
- Abstract: Text sanitization, which employs differential privacy to replace sensitive tokens with new ones, represents a significant technique for privacy protection. Typically, its performance in preserving privacy is evaluated by measuring the attack success rate (ASR) of reconstruction attacks, where attackers attempt to recover the original tokens from the sanitized ones. However, current reconstruction attacks on text sanitization are developed empirically, making it challenging to accurately assess the effectiveness of sanitization. In this paper, we aim to provide a more accurate evaluation of sanitization effectiveness. Inspired by the works of Palamidessi et al., we implement theoretically optimal reconstruction attacks targeting text sanitization. We derive their bounds on ASR as benchmarks for evaluating sanitization performance. For real-world applications, we propose two practical reconstruction attacks based on these theoretical findings. Our experimental results underscore the necessity of reassessing these overlooked risks. Notably, one of our attacks achieves a 46.4% improvement in ASR over the state-of-the-art baseline, with a privacy budget of epsilon=4.0 on the SST-2 dataset. Our code is available at: https://github.com/mengtong0110/On-the-Vulnerability-of-Text-Sanitization.
Related papers
- CleanerCLIP: Fine-grained Counterfactual Semantic Augmentation for Backdoor Defense in Contrastive Learning [53.766434746801366]
We propose a fine-grained textbfText textbfAlignment textbfCleaner (TA-Cleaner) to cut off feature connections of backdoor triggers.
TA-Cleaner achieves state-of-the-art defensiveness among finetuning-based defense techniques.
arXiv Detail & Related papers (2024-09-26T07:35:23Z) - Towards Physical World Backdoor Attacks against Skeleton Action Recognition [21.261855773907616]
Skeleton Action Recognition (SAR) has attracted significant interest for its efficient representation of the human skeletal structure.
Recent studies have raised security concerns in SAR models, particularly their vulnerability to adversarial attacks.
We introduce the Physical Skeleton Backdoor Attacks (PSBA), the first exploration of physical backdoor attacks against SAR.
arXiv Detail & Related papers (2024-08-16T11:29:33Z) - Efficient Trigger Word Insertion [9.257916713112945]
Our main objective is to reduce the number of poisoned samples while still achieving a satisfactory Attack Success Rate (ASR) in text backdoor attacks.
We propose an efficient trigger word insertion strategy in terms of trigger word optimization and poisoned sample selection.
Our approach achieves an ASR of over 90% with only 10 poisoned samples in the dirty-label setting and requires merely 1.5% of the training data in the clean-label setting.
arXiv Detail & Related papers (2023-11-23T12:15:56Z) - Semantic-Preserving Adversarial Code Comprehension [75.76118224437974]
We propose Semantic-Preserving Adversarial Code Embeddings (SPACE) to find the worst-case semantic-preserving attacks.
Experiments and analysis demonstrate that SPACE can stay robust against state-of-the-art attacks while boosting the performance of PrLMs for code.
arXiv Detail & Related papers (2022-09-12T10:32:51Z) - SemAttack: Natural Textual Attacks via Different Semantic Spaces [26.97034787803082]
We propose an efficient framework to generate natural adversarial text by constructing different semantic perturbation functions.
We show that SemAttack is able to generate adversarial texts for different languages with high attack success rates.
Our generated adversarial texts are natural and barely affect human performance.
arXiv Detail & Related papers (2022-05-03T03:44:03Z) - Perturbations in the Wild: Leveraging Human-Written Text Perturbations
for Realistic Adversarial Attack and Defense [19.76930957323042]
ANTHRO inductively extracts over 600K human-written text perturbations in the wild and leverages them for realistic adversarial attack.
We find that adversarial texts generated by ANTHRO achieve the best trade-off between (1) attack success rate, (2) semantic preservation of the original text, and (3) stealthiness--i.e. indistinguishable from human writings.
arXiv Detail & Related papers (2022-03-19T16:00:01Z) - Efficient Sharpness-aware Minimization for Improved Training of Neural
Networks [146.2011175973769]
This paper proposes Efficient Sharpness Aware Minimizer (M) which boosts SAM s efficiency at no cost to its generalization performance.
M includes two novel and efficient training strategies-StochasticWeight Perturbation and Sharpness-Sensitive Data Selection.
We show, via extensive experiments on the CIFAR and ImageNet datasets, that ESAM enhances the efficiency over SAM from requiring 100% extra computations to 40% vis-a-vis bases.
arXiv Detail & Related papers (2021-10-07T02:20:37Z) - Differential Privacy for Text Analytics via Natural Text Sanitization [44.95170585853761]
This paper takes a direct approach to text sanitization. Our insight is to consider both sensitivity and similarity via our new local DP notion.
The sanitized texts also contribute to our sanitization-aware pretraining and fine-tuning, enabling privacy-preserving natural language processing over the BERT language model with promising utility.
arXiv Detail & Related papers (2021-06-02T15:15:10Z) - Post-Contextual-Bandit Inference [57.88785630755165]
Contextual bandit algorithms are increasingly replacing non-adaptive A/B tests in e-commerce, healthcare, and policymaking.
They can both improve outcomes for study participants and increase the chance of identifying good or even best policies.
To support credible inference on novel interventions at the end of the study, we still want to construct valid confidence intervals on average treatment effects, subgroup effects, or value of new policies.
arXiv Detail & Related papers (2021-06-01T12:01:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.