With Great Dispersion Comes Greater Resilience: Efficient Poisoning
Attacks and Defenses for Linear Regression Models
- URL: http://arxiv.org/abs/2006.11928v5
- Date: Wed, 19 May 2021 07:51:43 GMT
- Title: With Great Dispersion Comes Greater Resilience: Efficient Poisoning
Attacks and Defenses for Linear Regression Models
- Authors: Jialin Wen, Benjamin Zi Hao Zhao, Minhui Xue, Alina Oprea and Haifeng
Qian
- Abstract summary: We analyze how attackers may interfere with the results of regression learning by poisoning datasets.
Our attack, termed Nopt, can produce larger errors with the same proportion of poisoning data-points.
Our new defense algorithm, termed Proda, demonstrates an increased effectiveness in reducing errors.
- Score: 28.680562906669216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rise of third parties in the machine learning pipeline, such as the
service provider in "Machine Learning as a Service" (MLaaS), external data
contributors in online learning, or the retraining of existing models, ensuring
the security of the resulting machine learning models has become an
increasingly important topic. The security community has demonstrated that
without transparency of the data and the resulting model, there exist many
potential security risks, with new risks constantly being discovered.
In this paper, we focus on one of these security risks -- poisoning attacks.
Specifically, we analyze how attackers may interfere with the results of
regression learning by poisoning the training datasets. To this end, we analyze
and develop a new poisoning attack algorithm. Our attack, termed Nopt, in
contrast with previous poisoning attack algorithms, can produce larger errors
with the same proportion of poisoning data-points. Furthermore, we also
significantly improve the state-of-the-art defense algorithm, termed TRIM,
proposed by Jagielski et al. (IEEE S&P 2018), by incorporating the concept of
probability estimation of clean data-points into the algorithm. Our new defense
algorithm, termed Proda, is more effective at reducing errors arising from
poisoned datasets because it optimizes over ensemble models. We highlight that
the time complexity of TRIM had not previously been estimated; however, we
deduce from their work that TRIM can take exponential time in the worst case,
far exceeding Proda's logarithmic time. The performance of
both our proposed attack and defense algorithms is extensively evaluated on
four real-world datasets of housing prices, loans, health care, and bike
sharing services. We hope that our work will inspire future research to develop
more robust learning algorithms immune to poisoning attacks.
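
The abstract does not include pseudocode for Nopt or Proda, but the general setting is easy to illustrate. Below is a minimal sketch, in Python with NumPy, of (i) a simple baseline response-flipping poisoning attack on a regression training set whose features and responses are normalized to [0, 1], and (ii) a TRIM-style trimmed-loss defense in the spirit of Jagielski et al. (IEEE S&P 2018). Both are illustrative assumptions made for this summary, not the paper's Nopt attack or Proda defense, and all function names and defaults are invented here.

```python
# Illustrative sketch only: a baseline response-flipping poisoning attack and a
# TRIM-style trimmed-loss defense for linear regression. This is NOT the paper's
# Nopt or Proda algorithm; names and defaults are assumptions for this example.
import numpy as np

def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression used as the base learner."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def baseline_poison(X, y, poison_rate=0.1, seed=0):
    """Duplicate random training points and flip their responses to the opposite
    extreme (assumes features and responses are normalized to [0, 1])."""
    rng = np.random.default_rng(seed)
    n_poison = int(poison_rate * len(y))
    idx = rng.choice(len(y), size=n_poison, replace=True)
    X_p, y_p = X[idx], 1.0 - y[idx]        # push responses toward the wrong extreme
    return np.vstack([X, X_p]), np.concatenate([y, y_p])

def trim_defense(X, y, n_clean, n_iters=50, lam=1e-3, seed=0):
    """TRIM-style defense: iteratively refit on the n_clean points with the
    smallest squared residuals, where n_clean is the assumed number of clean points."""
    rng = np.random.default_rng(seed)
    keep = rng.choice(len(y), size=n_clean, replace=False)   # random initial subset
    for _ in range(n_iters):
        w = fit_ridge(X[keep], y[keep], lam)
        resid = (X @ w - y) ** 2                             # per-point squared residual
        new_keep = np.argsort(resid)[:n_clean]               # keep the best-fitting points
        if set(new_keep) == set(keep):                       # subset unchanged: converged
            break
        keep = new_keep
    return w, keep

# Example usage on synthetic data:
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(0, 1, size=(500, 5))
    y = np.clip(X @ rng.uniform(0, 1, size=5) / 5 + 0.1 * rng.normal(size=500), 0, 1)
    Xp, yp = baseline_poison(X, y, poison_rate=0.2)
    w_def, kept = trim_defense(Xp, yp, n_clean=len(y))
    print("clean-data MSE after defense:", np.mean((X @ w_def - y) ** 2))
```

Per the abstract, Proda goes beyond this TRIM-style baseline by incorporating probability estimation of clean data points and optimizing over ensemble models.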
Related papers
- Data Poisoning and Leakage Analysis in Federated Learning [10.090442512374661]
Data poisoning and leakage risks impede the massive deployment of federated learning in the real world.
This chapter reveals the truths and pitfalls of understanding two dominating threats: training data privacy intrusion and training data poisoning.
arXiv Detail & Related papers (2024-09-19T16:50:29Z) - Have You Poisoned My Data? Defending Neural Networks against Data Poisoning [0.393259574660092]
We propose a novel approach to detect and filter poisoned datapoints in the transfer learning setting.
We show that effective poisons can be successfully differentiated from clean points in the characteristic vector space.
Our evaluation shows that our proposal outperforms existing approaches in defense rate and final trained model performance.
arXiv Detail & Related papers (2024-03-20T11:50:16Z) - Privacy-Preserving Distributed Learning for Residential Short-Term Load
Forecasting [11.185176107646956]
Power system load data can inadvertently reveal the daily routines of residential users, posing a risk to their property security.
We introduce a Markovian Switching-based distributed training framework, the convergence of which is substantiated through rigorous theoretical analysis.
Case studies employing real-world power system load data validate the efficacy of our proposed algorithm.
arXiv Detail & Related papers (2024-02-02T16:39:08Z) - On Practical Aspects of Aggregation Defenses against Data Poisoning
Attacks [58.718697580177356]
Attacks on deep learning models with malicious training samples are known as data poisoning.
Recent advances in defense strategies against data poisoning have highlighted the effectiveness of aggregation schemes in achieving certified poisoning robustness.
Here we focus on Deep Partition Aggregation, a representative aggregation defense, and assess its practical aspects, including efficiency, performance, and robustness.
arXiv Detail & Related papers (2023-06-28T17:59:35Z) - Avoid Adversarial Adaption in Federated Learning by Multi-Metric
Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z) - Adversarial training with informed data selection [53.19381941131439]
Adversarial training is the most efficient solution to defend the network against such malicious attacks.
This work proposes a data selection strategy to be applied in the mini-batch training.
The simulation results show that a good compromise can be obtained regarding robustness and standard accuracy.
arXiv Detail & Related papers (2023-01-07T12:09:50Z) - RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z) - Wild Patterns Reloaded: A Survey of Machine Learning Security against
Training Data Poisoning [32.976199681542845]
We provide a comprehensive systematization of poisoning attacks and defenses in machine learning.
We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly.
We argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities.
arXiv Detail & Related papers (2022-05-04T11:00:26Z) - Accumulative Poisoning Attacks on Real-time Data [56.96241557830253]
We show that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects.
arXiv Detail & Related papers (2021-06-18T08:29:53Z) - How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z) - Data Poisoning Attacks on Regression Learning and Corresponding Defenses [0.0]
Adversarial data poisoning is an effective attack against machine learning and threatens model integrity by introducing poisoned data into the training dataset.
We present realistic scenarios in which data poisoning attacks threaten production systems and introduce a novel black-box attack.
As a result, we observe that the mean squared error (MSE) of the regressor increases to 150 percent due to inserting only two percent of poison samples.
arXiv Detail & Related papers (2020-09-15T12:14:54Z)