Poisoning Attacks to Local Differential Privacy for Ranking Estimation
- URL: http://arxiv.org/abs/2506.24033v1
- Date: Mon, 30 Jun 2025 16:39:02 GMT
- Title: Poisoning Attacks to Local Differential Privacy for Ranking Estimation
- Authors: Pei Zhan, Peng Tang, Yangzhuo Li, Puwen Wei, Shanqing Guo
- Abstract summary: Local differential privacy (LDP) involves users perturbing their inputs to provide plausible deniability of their data. In this paper, we first introduce novel poisoning attacks for ranking estimation. We propose corresponding strategies for kRR, OUE, and OLH protocols.
- Score: 8.14832255549522
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Local differential privacy (LDP) involves users perturbing their inputs to provide plausible deniability of their data. However, this also makes LDP vulnerable to poisoning attacks. In this paper, we first introduce novel poisoning attacks for ranking estimation. These attacks are intricate, as attackers do not merely adjust the frequency of target items. Instead, they leverage a limited number of fake users to precisely modify frequencies, effectively altering item rankings to maximize gains. To tackle this challenge, we introduce the concepts of attack cost and optimal attack item (set), and propose corresponding strategies for kRR, OUE, and OLH protocols. For kRR, we iteratively select optimal attack items and allocate suitable fake users. For OUE, we iteratively determine optimal attack item sets and consider the incremental changes in item frequencies across different sets. Regarding OLH, we develop a harmonic cost function based on the pre-image of a hash value to select the value that supports a larger number of effective attack items. Lastly, we present an attack strategy based on confidence levels to quantify the probability of a successful attack and the number of attack iterations more precisely. We demonstrate the effectiveness of our attacks through theoretical and empirical evidence, highlighting the necessity for defenses against these attacks. The source code and data have been made available at https://github.com/LDP-user/LDP-Ranking.git.
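As a rough illustration of why frequency-oracle protocols such as kRR are exposed to this kind of manipulation, the sketch below simulates kRR perturbation and aggregation and shows how a batch of fake users who all report one target item inflate that item's estimated frequency and can change its rank. This is a minimal sketch under assumed parameters (k, epsilon, the item distribution, and the fake-user budget m are all illustrative); it is not the paper's attack, whose strategies additionally select optimal attack items or item sets and allocate fake users across them.

```python
import numpy as np

# Minimal sketch (not the paper's implementation): k-ary randomized response (kRR)
# perturbation and frequency estimation, plus the effect of fake users who all
# report one target item. k, epsilon, the item distribution, and the attack
# budget m are illustrative assumptions.

def krr_perturb(value, k, epsilon, rng):
    """Report the true item with probability p, otherwise a uniform other item."""
    p = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    if rng.random() < p:
        return int(value)
    other = int(rng.integers(k - 1))          # uniform over the k-1 other items
    return other if other < value else other + 1

def krr_estimate(reports, k, epsilon):
    """Standard unbiased frequency estimate from kRR reports."""
    p = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    q = 1.0 / (np.exp(epsilon) + k - 1)
    counts = np.bincount(np.asarray(reports), minlength=k)
    return (counts / len(reports) - q) / (p - q)

rng = np.random.default_rng(0)
k, epsilon = 10, 1.0
true_probs = np.linspace(1.0, 2.0, k)
true_probs /= true_probs.sum()
genuine = rng.choice(k, size=10_000, p=true_probs)           # genuine users' items
honest = [krr_perturb(v, k, epsilon, rng) for v in genuine]  # their kRR reports

target, m = 0, 500                   # target item and fake-user budget (assumed)
poisoned = honest + [target] * m     # fake users always report the target item

for name, reports in [("honest", honest), ("poisoned", poisoned)]:
    ranking = list(np.argsort(-krr_estimate(reports, k, epsilon)))
    print(f"{name}: estimated rank of target item = {ranking.index(target)}")
```

Because the aggregator inverts the perturbation when estimating frequencies, each fake report shifts the target's estimate by roughly 1/(p - q) per user, which is why a small fake-user budget can move an item's rank; choosing which items to attack and how to split the budget is what the paper's cost-based strategies address.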
Related papers
- An Interpretable N-gram Perplexity Threat Model for Large Language Model Jailbreaks [87.64278063236847]
In this work, we propose a unified threat model for the principled comparison of jailbreak attacks. Our threat model checks if a given jailbreak is likely to occur in the distribution of text. We adapt popular attacks to this threat model, and, for the first time, benchmark these attacks on equal footing with it.
arXiv Detail & Related papers (2024-10-21T17:27:01Z)
- Optimal Attack and Defense for Reinforcement Learning [11.36770403327493]
In adversarial RL, an external attacker has the power to manipulate the victim agent's interaction with the environment.
We study the attacker's problem of designing a stealthy attack that maximizes its own expected reward.
We argue that the optimal defense policy for the victim can be computed as the solution to a Stackelberg game.
arXiv Detail & Related papers (2023-11-30T21:21:47Z)
- RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models [62.72318564072706]
Reinforcement Learning with Human Feedback (RLHF) is a methodology designed to align Large Language Models (LLMs) with human preferences.
Despite its advantages, RLHF relies on human annotators to rank the text.
We propose RankPoison, a poisoning attack that flips the preference ranks of selected candidates to induce certain malicious behaviors.
arXiv Detail & Related papers (2023-11-16T07:48:45Z)
- DALA: A Distribution-Aware LoRA-Based Adversarial Attack against Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data.
Recent attack methods can achieve a relatively high attack success rate (ASR)
We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z)
- Adversarial Attacks Neutralization via Data Set Randomization [3.655021726150369]
Adversarial attacks on deep learning models pose a serious threat to their reliability and security.
We propose a new defense mechanism that is rooted on hyperspace projection.
We show that our solution increases the robustness of deep learning models against adversarial attacks.
arXiv Detail & Related papers (2023-06-21T10:17:55Z)
- Defending against the Label-flipping Attack in Federated Learning [5.769445676575767]
Federated learning (FL) provides autonomy and privacy by design to participating peers.
The label-flipping (LF) attack is a targeted poisoning attack where the attackers poison their training data by flipping the labels of some examples.
We propose a novel defense that first dynamically extracts those gradients from the peers' local updates.
arXiv Detail & Related papers (2022-07-05T12:02:54Z)
- Membership Inference Attacks From First Principles [24.10746844866869]
A membership inference attack allows an adversary to query a trained machine learning model to predict whether or not a particular example was contained in the model's training dataset.
These attacks are currently evaluated using average-case "accuracy" metrics that fail to characterize whether the attack can confidently identify any members of the training set.
We argue that attacks should instead be evaluated by computing their true-positive rate at low false-positive rates, and find most prior attacks perform poorly when evaluated in this way (a minimal sketch of this metric appears after this list).
Our attack is 10x more powerful at low false-positive rates, and also strictly dominates prior attacks on existing metrics.
arXiv Detail & Related papers (2021-12-07T08:47:00Z)
- Composite Adversarial Attacks [57.293211764569996]
Adversarial attack is a technique for deceiving Machine Learning (ML) models.
In this paper, a new procedure called Composite Adversarial Attack (CAA) is proposed for automatically searching the best combination of attack algorithms.
CAA beats 10 top attackers on 11 diverse defenses with less elapsed time.
arXiv Detail & Related papers (2020-12-10T03:21:16Z)
- Subpopulation Data Poisoning Attacks [18.830579299974072]
Poisoning attacks against machine learning induce adversarial modification of data used by a machine learning algorithm to selectively change its output when it is deployed.
We introduce a novel data poisoning attack called a subpopulation attack, which is particularly relevant when datasets are large and diverse.
We design a modular framework for subpopulation attacks, instantiate it with different building blocks, and show that the attacks are effective for a variety of datasets and machine learning models.
arXiv Detail & Related papers (2020-06-24T20:20:52Z)
- RayS: A Ray Searching Method for Hard-label Adversarial Attack [99.72117609513589]
We present the Ray Searching attack (RayS), which greatly improves the hard-label attack effectiveness as well as efficiency.
RayS attack can also be used as a sanity check for possible "falsely robust" models.
arXiv Detail & Related papers (2020-06-23T07:01:50Z)
- Revisiting Membership Inference Under Realistic Assumptions [87.13552321332988]
We study membership inference in settings where some of the assumptions typically used in previous research are relaxed.
This setting is more realistic than the balanced prior setting typically considered by researchers.
We develop a new inference attack based on the intuition that inputs corresponding to training set members will be near a local minimum in the loss function.
arXiv Detail & Related papers (2020-05-21T20:17:42Z)
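One of the related papers above (Membership Inference Attacks From First Principles) argues for evaluating attacks by their true-positive rate at a low false-positive rate rather than by average-case accuracy. The sketch below, using synthetic scores and an assumed 0.1% FPR budget, shows one straightforward way to compute such a metric; it is illustrative only and not taken from that paper's code.

```python
import numpy as np

# Illustrative only: true-positive rate at a fixed low false-positive rate,
# computed from hypothetical membership-inference attack scores.

def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.001):
    """Largest TPR achievable while flagging at most target_fpr of non-members."""
    threshold = np.quantile(nonmember_scores, 1.0 - target_fpr)
    return float(np.mean(np.asarray(member_scores) > threshold))

rng = np.random.default_rng(1)
members = rng.normal(1.0, 1.0, size=10_000)     # assumed attack scores on training members
nonmembers = rng.normal(0.0, 1.0, size=10_000)  # assumed attack scores on non-members
print(f"TPR at 0.1% FPR: {tpr_at_fpr(members, nonmembers):.3f}")
```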