dataRLsec: Safety, Security, and Reliability With Robust Offline Reinforcement Learning for DPAs
- URL: http://arxiv.org/abs/2601.01289v1
- Date: Sat, 03 Jan 2026 21:28:17 GMT
- Title: dataRLsec: Safety, Security, and Reliability With Robust Offline Reinforcement Learning for DPAs
- Authors: Shriram KS Pandian, Naresh Kshetri
- Abstract summary: Data poisoning attacks (DPAs) are becoming popular as artificial intelligence (AI), machine learning (ML), and deep learning (DL) algorithms proliferate.
Hackers and testers are injecting malicious content into the training data (and into the testing data too), leading to false results that are very hard to inspect and predict.
We have analyzed several recent technologies used for DPAs and their safety, security, & countermeasures.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data poisoning attacks (DPAs) are becoming popular as artificial intelligence (AI), machine learning (ML), and deep learning (DL) algorithms proliferate in this AI era. Hackers and penetration testers are injecting malicious content into the training data (and into the testing data too), leading to false results that are very hard to inspect and predict. We have analyzed several recent technologies used for DPAs (from deep reinforcement learning to federated learning) and their safety, security, & countermeasures. The problem setup and problem estimation are shown in the MuJoCo environment, with the performance of HalfCheetah before and after the dataset is poisoned. We have analyzed several risks associated with DPAs and the falsification of medical data, from popular data poisoning attacks to popular data defenses. We propose robust offline reinforcement learning (Offline RL) for safety and reliability, using weighted hash verification together with the density-ratio weighted behavioral cloning (DWBC) algorithm. The four stages of the proposed algorithm (Stage 0 through Stage 3) are described with respect to offline RL, safety, and security for DPAs. The conclusion and future scope are provided with the intent to combine DWBC with other data defense strategies to counter and protect against future contamination cyberattacks.
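The abstract's combination of weighted hash verification and density-ratio weighted behavioral cloning can be illustrated with a minimal sketch. This is not the paper's actual four-stage algorithm: the function names (`record_hashes`, `verify_weights`, `dwbc_loss`) are hypothetical, and a real DWBC implementation would learn density ratios with a discriminator rather than receive them as fixed weights.

```python
import hashlib
import numpy as np

def record_hashes(transitions):
    # At data-collection time (Stage 0 in spirit): record a SHA-256
    # hash of each transition's raw bytes as a trusted fingerprint.
    return [hashlib.sha256(t.tobytes()).hexdigest() for t in transitions]

def verify_weights(transitions, trusted_hashes):
    # Weighted hash verification (sketch): a transition whose hash no
    # longer matches its trusted record gets weight 0, intact ones 1.
    return np.array([
        1.0 if hashlib.sha256(t.tobytes()).hexdigest() == h else 0.0
        for t, h in zip(transitions, trusted_hashes)
    ])

def dwbc_loss(pred_actions, data_actions, weights):
    # Weighted behavioral cloning loss (sketch): ordinary squared-error
    # BC, re-weighted per sample so suspect transitions contribute less.
    per_sample = np.sum((pred_actions - data_actions) ** 2, axis=1)
    return float(np.mean(weights * per_sample))
```

Hash-based weights and learned density ratios could then be multiplied, so that a transition that is either tampered with or far from the trusted data distribution contributes little or nothing to the cloning loss.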
Related papers
- Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain [82.98626829232899]
Fine-tuning AI agents on data from their own interactions introduces a critical security vulnerability within the AI supply chain.
We show that adversaries can easily poison the data collection pipeline to embed hard-to-detect backdoors.
arXiv Detail & Related papers (2025-10-03T12:47:21Z)
- Hack Me If You Can: Aggregating AutoEncoders for Countering Persistent Access Threats Within Highly Imbalanced Data [4.619717316983648]
Advanced Persistent Threats (APTs) are sophisticated, targeted cyberattacks designed to gain unauthorized access to systems and remain undetected for extended periods.
We present AE-APT, a deep learning-based tool for APT detection that features a family of AutoEncoder methods ranging from a basic one to a Transformer-based one.
The outcomes showed that AE-APT has significantly higher detection rates compared to its competitors, indicating superior performance in detecting and ranking anomalies.
arXiv Detail & Related papers (2024-06-27T14:45:38Z)
- Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails.
We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses.
C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z)
- Robust Synthetic Data-Driven Detection of Living-Off-the-Land Reverse Shells [14.710331873072146]
Living-off-the-land (LOTL) techniques pose a significant challenge to security operations.
We present a robust augmentation framework for cyber defense systems such as Security Information and Event Management (SIEM) solutions.
arXiv Detail & Related papers (2024-02-28T13:49:23Z)
- DAD++: Improved Data-free Test Time Adversarial Defense [12.606555446261668]
We propose a test-time Data-free Adversarial Defense (DAD) containing detection and correction frameworks.
We conduct a wide range of experiments and ablations on several datasets and network architectures to show the efficacy of our proposed approach.
Our DAD++ delivers impressive performance against various adversarial attacks with a minimal drop in clean accuracy.
arXiv Detail & Related papers (2023-09-10T20:39:53Z) - Analyzing Robustness of the Deep Reinforcement Learning Algorithm in
Ramp Metering Applications Considering False Data Injection Attack and
Defense [0.0]
Ramp metering is the act of controlling the flow of vehicles entering the highway mainlines.
In this study, the Deep Q-Learning algorithm uses only loop detector information as input.
The model can be applied to almost any ramp metering site regardless of road geometry and layout.
arXiv Detail & Related papers (2023-01-28T00:40:46Z) - Pre-trained Encoders in Self-Supervised Learning Improve Secure and
Privacy-preserving Supervised Learning [63.45532264721498]
Self-supervised learning is an emerging technique to pre-train encoders using unlabeled data.
We perform the first systematic, principled measurement study to understand whether and when a pre-trained encoder can address the limitations of secure or privacy-preserving supervised learning algorithms.
arXiv Detail & Related papers (2022-12-06T21:35:35Z)
- Lethal Dose Conjecture on Data Poisoning [122.83280749890078]
Data poisoning considers an adversary that distorts the training set of machine learning algorithms for malicious purposes.
In this work, we bring to light one conjecture regarding the fundamentals of data poisoning, which we call the Lethal Dose Conjecture.
arXiv Detail & Related papers (2022-08-05T17:53:59Z)
- A Data Quarantine Model to Secure Data in Edge Computing [0.0]
Edge computing provides an agile data processing platform for latency-sensitive and communication-intensive applications.
Data integrity attacks can lead to inconsistent data and intrude on edge data analytics.
This paper proposes a new data quarantine model that mitigates data integrity attacks by quarantining intruders.
arXiv Detail & Related papers (2021-11-15T11:04:48Z)
- Accumulative Poisoning Attacks on Real-time Data [56.96241557830253]
We show that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects.
arXiv Detail & Related papers (2021-06-18T08:29:53Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- With Great Dispersion Comes Greater Resilience: Efficient Poisoning Attacks and Defenses for Linear Regression Models [28.680562906669216]
We analyze how attackers may interfere with the results of regression learning by poisoning datasets.
Our attack, termed Nopt, can produce larger errors with the same proportion of poisoned data points.
Our new defense algorithm, termed Proda, demonstrates an increased effectiveness in reducing errors.
arXiv Detail & Related papers (2020-06-21T22:36:42Z)
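To make concrete how a small fraction of poisoned points can skew regression learning, as studied in the entry above, here is a minimal self-contained illustration. It is a generic least-squares example with adversarially labeled high-leverage points, not the Nopt attack or Proda defense from that paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean data: 100 points from y = 2x plus small noise, true slope ~2.
X = rng.uniform(0.0, 1.0, size=(100, 1))
y = 2.0 * X[:, 0] + rng.normal(0.0, 0.01, size=100)

def ols_slope(X, y):
    # Ordinary least-squares slope (no intercept, for brevity).
    return float(np.linalg.lstsq(X, y, rcond=None)[0][0])

clean_slope = ols_slope(X, y)

# Inject 5 poisoned points (5% of the data) at the high-leverage
# position x = 1 with adversarial labels y = -10.
X_poisoned = np.vstack([X, np.full((5, 1), 1.0)])
y_poisoned = np.concatenate([y, np.full(5, -10.0)])
poisoned_slope = ols_slope(X_poisoned, y_poisoned)
```

With only 5% poisoned points, the fitted slope drops from roughly 2 to well under 1, because least squares weights each point by its squared residual and high-leverage outliers dominate the fit.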
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers or information and is not responsible for any consequences.