Generating Poisoning Attacks against Ridge Regression Models with Categorical Features
- URL: http://arxiv.org/abs/2501.07275v1
- Date: Mon, 13 Jan 2025 12:40:52 GMT
- Title: Generating Poisoning Attacks against Ridge Regression Models with Categorical Features
- Authors: Monse Guedes-Ayala, Lars Schewe, Zeynep Suvak, Miguel Anjos
- Abstract summary: Machine Learning (ML) models have become a very powerful tool to extract information from large datasets.
ML models can be vulnerable to external attacks, causing them to underperform or deviate from their expected tasks.
In this paper, we propose an algorithm to generate strong poisoning attacks for a ridge regression model containing both numerical and categorical features that explicitly models and poisons categorical features.
- Score: 0.0
- Abstract: Machine Learning (ML) models have become a very powerful tool to extract information from large datasets and use it to make accurate predictions and automated decisions. However, ML models can be vulnerable to external attacks, causing them to underperform or deviate from their expected tasks. One way to attack ML models is by injecting malicious data to mislead the algorithm during the training phase, which is referred to as a poisoning attack. We can prepare for such situations by designing anticipated attacks, which are later used for creating and testing defence strategies. In this paper, we propose an algorithm to generate strong poisoning attacks for a ridge regression model containing both numerical and categorical features that explicitly models and poisons categorical features. We model categorical features as SOS-1 sets and formulate the problem of designing poisoning attacks as a bilevel optimization problem that is nonconvex mixed-integer in the upper-level and unconstrained convex quadratic in the lower-level. We present the mathematical formulation of the problem, introduce a single-level reformulation based on the Karush-Kuhn-Tucker (KKT) conditions of the lower level, find bounds for the lower-level variables to accelerate solver performance, and propose a new algorithm to poison categorical features. Numerical experiments show that our method improves the mean squared error of all datasets compared to the previous benchmark in the literature.
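To make the setup concrete, below is a minimal sketch of the kind of bilevel formulation the abstract describes. The notation, the use of a separate validation set as the attacker's objective, and the exact constraint set are illustrative assumptions and may differ from the paper's formulation.

```latex
% Sketch of a bilevel poisoning formulation for ridge regression with one-hot
% (SOS-1) categorical features. Notation and the validation-set objective are
% illustrative assumptions, not the paper's exact formulation.
\[
\begin{aligned}
\max_{\hat{X},\,\hat{y}} \quad
  & \frac{1}{|\mathcal{D}_{\mathrm{val}}|}
    \sum_{(x,\,y) \in \mathcal{D}_{\mathrm{val}}} \big( w^\top x + b - y \big)^2
  && \text{attacker maximizes prediction error} \\
\text{s.t.} \quad
  & \hat{x}_{jk} \in [0, 1]
  && \text{numerical features of poisoned point } j \\
  & \textstyle\sum_{c \in C_f} \hat{x}_{jc} = 1, \quad \hat{x}_{jc} \in \{0, 1\}
  && \text{SOS-1 set for each categorical feature } f \\
  & (w, b) \in \arg\min_{w',\, b'}
    \textstyle\sum_{(x,\,y) \in \mathcal{D} \cup \hat{\mathcal{D}}}
      \big( w'^\top x + b' - y \big)^2 + \lambda \lVert w' \rVert_2^2
  && \text{lower-level ridge regression}
\end{aligned}
\]
```

Because the lower level is an unconstrained, strictly convex quadratic, it can be replaced by its stationarity (KKT) condition, \(\nabla_{w,b}\big[\sum_{(x,y)\in\mathcal{D}\cup\hat{\mathcal{D}}}(w^\top x + b - y)^2 + \lambda\lVert w\rVert_2^2\big] = 0\), which yields the single-level mixed-integer reformulation the abstract refers to.

As a further illustration of how the categorical part of such an attack could be searched, the snippet below greedily re-assigns the one-hot level of each categorical feature in each poisoned row to whichever level most increases the trained ridge model's error on held-out data. The function names, the greedy strategy, and the validation objective are assumptions for illustration only, not the authors' algorithm.

```python
# Illustrative sketch only: a greedy, coordinate-wise search over one-hot
# categorical levels of poisoned rows against ridge regression. The helper
# names, the greedy strategy, and the validation objective are assumptions,
# not the paper's solver-based method.
import numpy as np
from sklearn.linear_model import Ridge


def val_mse(X_clean, y_clean, X_pois, y_pois, X_val, y_val, lam=1.0):
    """Train ridge on clean + poisoned data and return MSE on held-out data."""
    X = np.vstack([X_clean, X_pois])
    y = np.concatenate([y_clean, y_pois])
    model = Ridge(alpha=lam).fit(X, y)
    return float(np.mean((model.predict(X_val) - y_val) ** 2))


def poison_categorical(X_clean, y_clean, X_pois, y_pois, X_val, y_val,
                       cat_groups, lam=1.0):
    """Greedily pick, for every poisoned row and every categorical feature,
    the single active one-hot level (SOS-1) that maximizes the trained
    model's held-out MSE."""
    X_pois = X_pois.copy()
    for j in range(X_pois.shape[0]):              # each poisoned row
        for group in cat_groups:                  # column indices of one categorical feature
            best_mse, best_col = -np.inf, group[0]
            for col in group:                     # try activating each level in turn
                X_pois[j, group] = 0.0
                X_pois[j, col] = 1.0
                mse = val_mse(X_clean, y_clean, X_pois, y_pois, X_val, y_val, lam)
                if mse > best_mse:
                    best_mse, best_col = mse, col
            X_pois[j, group] = 0.0                # keep the most damaging level
            X_pois[j, best_col] = 1.0
    return X_pois
```

A solver-based approach, as described in the abstract, would instead encode the SOS-1 constraints and the lower-level KKT system in a single mixed-integer model and optimise the poisoned numerical and categorical features jointly.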
Related papers
- Data-Agnostic Model Poisoning against Federated Learning: A Graph Autoencoder Approach [65.2993866461477]
This paper proposes a data-agnostic, model poisoning attack on Federated Learning (FL)
The attack requires no knowledge of FL training data and achieves both effectiveness and undetectability.
Experiments show that the FL accuracy drops gradually under the proposed attack and existing defense mechanisms fail to detect it.
arXiv Detail & Related papers (2023-11-30T12:19:10Z)
- Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
We propose a novel information measure, namely, Memorization Discrepancy, to explore the defense via the model-level information.
By implicitly transferring the changes in the data manipulation to that in the model outputs, Memorization Discrepancy can discover the imperceptible poison samples.
We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks.
arXiv Detail & Related papers (2023-06-06T14:45:24Z)
- Hyperparameter Learning under Data Poisoning: Analysis of the Influence of Regularization via Multiobjective Bilevel Optimization [3.3181276611945263]
Machine Learning (ML) algorithms are vulnerable to poisoning attacks, where a fraction of the training data is manipulated to deliberately degrade the algorithms' performance.
Optimal attacks can be formulated as bilevel optimization problems and help to assess the robustness of the algorithms in worst-case scenarios.
arXiv Detail & Related papers (2023-06-02T15:21:05Z)
- Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks [31.339252233416477]
We introduce the notion of model poisoning reachability as a technical tool to explore the intrinsic limits of data poisoning attacks towards target parameters.
We derive an easily computable threshold to establish and quantify a surprising phase transition phenomenon among popular ML models.
Our work highlights the critical role played by the poisoning ratio, and sheds new insights on existing empirical results, attacks and mitigation strategies in data poisoning.
arXiv Detail & Related papers (2023-03-07T01:55:26Z)
- Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., single sample attack (SSA) and triggered samples attack (TSA).
arXiv Detail & Related papers (2022-07-25T03:24:58Z)
- X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To harness the power of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z)
- Regularization Can Help Mitigate Poisoning Attacks... with the Right Hyperparameters [1.8570591025615453]
Machine learning algorithms are vulnerable to poisoning attacks, where a fraction of the training data is manipulated to degrade the algorithms' performance.
We show that current approaches, which typically assume that regularization hyperparameters remain constant, lead to an overly pessimistic view of the algorithms' robustness.
We propose a novel optimal attack formulation that considers the effect of the attack on the hyperparameters, modelling the attack as a minimax bilevel optimization problem.
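For intuition, here is a minimal sketch of a minimax bilevel attack of this kind, in illustrative notation; the paper's exact losses and the way the hyperparameters are selected are assumptions here.

```latex
% Illustrative minimax bilevel attack: the attacker picks poisoned data while
% anticipating that the learner re-tunes the regularization hyperparameter.
% All symbols are assumed notation, not the paper's.
\[
\max_{\hat{\mathcal{D}}} \;
\min_{\lambda \ge 0} \;
\mathcal{L}_{\mathrm{val}}\!\big( w^\star(\hat{\mathcal{D}}, \lambda) \big),
\qquad
w^\star(\hat{\mathcal{D}}, \lambda)
  \in \arg\min_{w} \;
  \mathcal{L}_{\mathrm{tr}}\big( w;\, \mathcal{D} \cup \hat{\mathcal{D}} \big)
  + \lambda\, \Omega(w).
\]
```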
arXiv Detail & Related papers (2021-05-23T14:34:47Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
By utilizing the latest technique in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
- Adversarial Poisoning Attacks and Defense for General Multi-Class Models Based On Synthetic Reduced Nearest Neighbors [14.968442560499753]
State-of-the-art machine learning models are vulnerable to data poisoning attacks.
First, this paper proposes a novel model-free label-flipping attack based on the multi-modality of the data.
Second, a novel defense technique based on the Synthetic Reduced Nearest Neighbor (SRNN) model is proposed.
arXiv Detail & Related papers (2021-02-11T06:55:40Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- Regularisation Can Mitigate Poisoning Attacks: A Novel Analysis Based on Multiobjective Bilevel Optimisation [3.3181276611945263]
Machine Learning (ML) algorithms are vulnerable to poisoning attacks, where a fraction of the training data is manipulated to deliberately degrade the algorithms' performance.
Optimal poisoning attacks, which can be formulated as bilevel problems, help to assess the robustness of learning algorithms in worst-case scenarios.
We show that this approach, which keeps the regularisation hyperparameters fixed, leads to an overly pessimistic view of the robustness of the algorithms.
arXiv Detail & Related papers (2020-02-28T19:46:10Z)