With Great Dispersion Comes Greater Resilience: Efficient Poisoning
Attacks and Defenses for Linear Regression Models
- URL: http://arxiv.org/abs/2006.11928v5
- Date: Wed, 19 May 2021 07:51:43 GMT
- Title: With Great Dispersion Comes Greater Resilience: Efficient Poisoning
Attacks and Defenses for Linear Regression Models
- Authors: Jialin Wen, Benjamin Zi Hao Zhao, Minhui Xue, Alina Oprea and Haifeng
Qian
- Abstract summary: We analyze how attackers may interfere with the results of regression learning by poisoning datasets.
Our attack, termed Nopt, can produce larger errors with the same proportion of poisoning data-points.
Our new defense algorithm, termed Proda, demonstrates an increased effectiveness in reducing errors.
- Score: 28.680562906669216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rise of third parties in the machine learning pipeline, such as the
service provider in "Machine Learning as a Service" (MLaaS), external data
contributors in online learning, or the retraining of existing models, ensuring
the security of the resulting machine learning models has become an
increasingly important topic. The security community has demonstrated that
without transparency of the data and the resulting model, there exist many
potential security risks, with new risks constantly being discovered.
In this paper, we focus on one of these security risks -- poisoning attacks.
Specifically, we analyze how attackers may interfere with the results of
regression learning by poisoning the training datasets. To this end, we analyze
and develop a new poisoning attack algorithm. Our attack, termed Nopt, in
contrast with previous poisoning attack algorithms, can produce larger errors
with the same proportion of poisoning data-points. Furthermore, we also
significantly improve the state-of-the-art defense algorithm, termed TRIM,
proposed by Jagielski et al. (IEEE S&P 2018), by incorporating the concept of
probability estimation of clean data-points into the algorithm. Our new defense
algorithm, termed Proda, is more effective at reducing errors arising from
poisoned datasets because it optimizes over ensemble models. We highlight that
the time complexity of TRIM had not previously been estimated; however, we
deduce from their work that TRIM can take exponential time in the worst case,
far exceeding Proda's logarithmic time. The performance of
both our proposed attack and defense algorithms is extensively evaluated on
four real-world datasets of housing prices, loans, health care, and bike
sharing services. We hope that our work will inspire future research to develop
more robust learning algorithms immune to poisoning attacks.
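
The abstract does not include pseudocode for Nopt or Proda, but the general setting is easy to illustrate. Below is a minimal sketch, in Python with NumPy, of (i) a simple baseline response-flipping poisoning attack on a regression training set whose features and responses are normalized to [0, 1], and (ii) a TRIM-style trimmed-loss defense in the spirit of Jagielski et al. (IEEE S&P 2018). Both are illustrative assumptions made for this summary, not the paper's Nopt attack or Proda defense, and all function names and defaults are invented here.

```python
# Illustrative sketch only: a baseline response-flipping poisoning attack and a
# TRIM-style trimmed-loss defense for linear regression. This is NOT the paper's
# Nopt or Proda algorithm; names and defaults are assumptions for this example.
import numpy as np

def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression used as the base learner."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def baseline_poison(X, y, poison_rate=0.1, seed=0):
    """Duplicate random training points and flip their responses to the opposite
    extreme (assumes features and responses are normalized to [0, 1])."""
    rng = np.random.default_rng(seed)
    n_poison = int(poison_rate * len(y))
    idx = rng.choice(len(y), size=n_poison, replace=True)
    X_p, y_p = X[idx], 1.0 - y[idx]        # push responses toward the wrong extreme
    return np.vstack([X, X_p]), np.concatenate([y, y_p])

def trim_defense(X, y, n_clean, n_iters=50, lam=1e-3, seed=0):
    """TRIM-style defense: iteratively refit on the n_clean points with the
    smallest squared residuals, where n_clean is the assumed number of clean points."""
    rng = np.random.default_rng(seed)
    keep = rng.choice(len(y), size=n_clean, replace=False)   # random initial subset
    for _ in range(n_iters):
        w = fit_ridge(X[keep], y[keep], lam)
        resid = (X @ w - y) ** 2                             # per-point squared residual
        new_keep = np.argsort(resid)[:n_clean]               # keep the best-fitting points
        if set(new_keep) == set(keep):                       # subset unchanged: converged
            break
        keep = new_keep
    return w, keep

# Example usage on synthetic data:
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(0, 1, size=(500, 5))
    y = np.clip(X @ rng.uniform(0, 1, size=5) / 5 + 0.1 * rng.normal(size=500), 0, 1)
    Xp, yp = baseline_poison(X, y, poison_rate=0.2)
    w_def, kept = trim_defense(Xp, yp, n_clean=len(y))
    print("clean-data MSE after defense:", np.mean((X @ w_def - y) ** 2))
```

Per the abstract, Proda goes beyond this TRIM-style baseline by incorporating probability estimation of clean data points and optimizing over ensemble models.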
Related papers
- Data Poisoning and Leakage Analysis in Federated Learning [10.090442512374661]
Data poisoning and leakage risks impede the massive deployment of federated learning in the real world.
This chapter reveals the truths and pitfalls of understanding two dominating threats: training data privacy intrusion and training data poisoning.
arXiv Detail & Related papers (2024-09-19T16:50:29Z) - Have You Poisoned My Data? Defending Neural Networks against Data Poisoning [0.393259574660092]
We propose a novel approach to detect and filter poisoned datapoints in the transfer learning setting.
We show that effective poisons can be successfully differentiated from clean points in the characteristic vector space.
Our evaluation shows that our proposal outperforms existing approaches in defense rate and final trained model performance.
arXiv Detail & Related papers (2024-03-20T11:50:16Z) - Privacy-Preserving Distributed Learning for Residential Short-Term Load
Forecasting [11.185176107646956]
Power system load data can inadvertently reveal the daily routines of residential users, posing a risk to their property security.
We introduce a Markovian Switching-based distributed training framework, the convergence of which is substantiated through rigorous theoretical analysis.
Case studies employing real-world power system load data validate the efficacy of our proposed algorithm.
arXiv Detail & Related papers (2024-02-02T16:39:08Z) - On Practical Aspects of Aggregation Defenses against Data Poisoning
Attacks [58.718697580177356]
Attacks on deep learning models with malicious training samples are known as data poisoning.
Recent advances in defense strategies against data poisoning have highlighted the effectiveness of aggregation schemes in achieving certified poisoning robustness.
Here we focus on Deep Partition Aggregation, a representative aggregation defense, and assess its practical aspects, including efficiency, performance, and robustness.
arXiv Detail & Related papers (2023-06-28T17:59:35Z) - Avoid Adversarial Adaption in Federated Learning by Multi-Metric
Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z) - Adversarial training with informed data selection [53.19381941131439]
Adversarial training is the most efficient solution to defend the network against such malicious attacks.
This work proposes a data selection strategy to be applied in the mini-batch training.
The simulation results show that a good compromise can be obtained regarding robustness and standard accuracy.
arXiv Detail & Related papers (2023-01-07T12:09:50Z) - RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z) - Wild Patterns Reloaded: A Survey of Machine Learning Security against
Training Data Poisoning [32.976199681542845]
We provide a comprehensive systematization of poisoning attacks and defenses in machine learning.
We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly.
We argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities.
arXiv Detail & Related papers (2022-05-04T11:00:26Z) - Accumulative Poisoning Attacks on Real-time Data [56.96241557830253]
We show that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects.
arXiv Detail & Related papers (2021-06-18T08:29:53Z) - How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z) - Data Poisoning Attacks on Regression Learning and Corresponding Defenses [0.0]
Adversarial data poisoning is an effective attack against machine learning and threatens model integrity by introducing poisoned data into the training dataset.
We present realistic scenarios in which data poisoning attacks threaten production systems and introduce a novel black-box attack.
As a result, we observe that the mean squared error (MSE) of the regressor increases to 150 percent due to inserting only two percent of poison samples.
arXiv Detail & Related papers (2020-09-15T12:14:54Z)