A Framework for Adversarial Analysis of Decision Support Systems Prior to Deployment
- URL: http://arxiv.org/abs/2505.21414v1
- Date: Tue, 27 May 2025 16:41:23 GMT
- Title: A Framework for Adversarial Analysis of Decision Support Systems Prior to Deployment
- Authors: Brett Bissey, Kyle Gatesman, Walker Dimon, Mohammad Alam, Luis Robaina, Joseph Weissman
- Abstract summary: This paper introduces a framework to analyze and secure decision-support systems trained with Deep Reinforcement Learning (DRL). We validate our framework, visualize agent behavior, and evaluate adversarial outcomes within the context of a custom-built strategic game, CyberStrike.
- Score: 0.33928435949901725
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper introduces a comprehensive framework designed to analyze and secure decision-support systems trained with Deep Reinforcement Learning (DRL), prior to deployment, by providing insights into learned behavior patterns and vulnerabilities discovered through simulation. The introduced framework aids in the development of precisely timed and targeted observation perturbations, enabling researchers to assess adversarial attack outcomes within a strategic decision-making context. We validate our framework, visualize agent behavior, and evaluate adversarial outcomes within the context of a custom-built strategic game, CyberStrike. Utilizing the proposed framework, we introduce a method for systematically discovering and ranking the impact of attacks on various observation indices and time-steps, and we conduct experiments to evaluate the transferability of adversarial attacks across agent architectures and DRL training algorithms. The findings underscore the critical need for robust adversarial defense mechanisms to protect decision-making policies in high-stakes environments.
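At its core, the ranking method described in the abstract reduces to a sweep over (time-step, observation-index) pairs, scored by the drop in episode return. The sketch below illustrates that loop; since CyberStrike and the paper's code are not public, the gym-style `env`/`policy` interface, the additive `delta` perturbation, and all names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def episode_return(env, policy, attack=None):
    """Roll out one episode; if `attack` is given, perturb observation
    index attack['index'] at time-step attack['step'] before the policy
    sees it (assumes a gym-style env returning obs, reward, done)."""
    obs, total, t, done = env.reset(), 0.0, 0, False
    while not done:
        if attack is not None and t == attack["step"]:
            obs = obs.copy()
            obs[attack["index"]] += attack["delta"]  # timed, targeted perturbation
        obs, reward, done = env.step(policy(obs))
        total += reward
        t += 1
    return total

def rank_attacks(env, policy, horizon, obs_dim, delta=1.0, episodes=20):
    """Sweep every (time-step, observation-index) pair and rank pairs by
    mean return drop relative to the unattacked baseline."""
    baseline = np.mean([episode_return(env, policy) for _ in range(episodes)])
    impact = {}
    for t in range(horizon):
        for i in range(obs_dim):
            atk = {"step": t, "index": i, "delta": delta}
            attacked = np.mean([episode_return(env, policy, atk)
                                for _ in range(episodes)])
            impact[(t, i)] = baseline - attacked  # larger = more damaging
    return sorted(impact.items(), key=lambda kv: -kv[1])
```

A transferability experiment in this style would replay the top-ranked (step, index) pairs against policies trained with different architectures or DRL algorithms and compare the induced return drops.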
Related papers
- A Survey on Model Extraction Attacks and Defenses for Large Language Models [55.60375624503877]
Model extraction attacks pose significant security threats to deployed language models. This survey provides a comprehensive taxonomy of extraction attacks and defenses, categorizing attacks into functionality extraction, training data extraction, and prompt-targeted attacks. We examine defense mechanisms organized into model protection, data privacy protection, and prompt-targeted strategies, evaluating their effectiveness across different deployment scenarios.
arXiv Detail & Related papers (2025-06-26T22:02:01Z)
- Unveiling the Black Box: A Multi-Layer Framework for Explaining Reinforcement Learning-Based Cyber Agents [4.239727656979701]
We propose a unified, multi-layer explainability framework for RL-based attacker agents. At the MDP level, we model cyberattacks as Partially Observable Markov Decision Processes (POMDPs). At the policy level, we analyse the temporal evolution of Q-values and use Prioritised Experience Replay (PER) to surface critical learning transitions.
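As a rough illustration of the PER component, here is a minimal proportional prioritized replay buffer in which high-TD-error transitions are both sampled more often and can be surfaced as "critical" for explanation; the class and its parameters are illustrative, not this paper's code.

```python
import numpy as np

class PrioritizedReplay:
    """Proportional prioritized replay: transitions with large TD error
    are sampled more often, which also flags them as candidates for
    explanation."""
    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.priorities = [], []

    def add(self, transition, td_error):
        if len(self.data) >= self.capacity:
            self.data.pop(0); self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        p = np.array(self.priorities); p = p / p.sum()
        idx = np.random.choice(len(self.data), batch_size, p=p)
        return [self.data[i] for i in idx], idx

    def most_critical(self, k=10):
        """Top-k transitions by priority: the 'critical learning
        transitions' an explanation layer might inspect."""
        order = np.argsort(self.priorities)[::-1][:k]
        return [self.data[i] for i in order]
```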
arXiv Detail & Related papers (2025-05-16T21:29:55Z)
- Beyond Black-Box Benchmarking: Observability, Analytics, and Optimization of Agentic Systems [1.415098516077151]
The rise of agentic AI systems, where agents collaborate to perform diverse tasks, poses new challenges in observing, analyzing, and optimizing their behavior. Traditional evaluation and benchmarking approaches struggle to handle the non-deterministic, context-sensitive, and dynamic nature of these systems. This paper explores key challenges and opportunities in analyzing and optimizing agentic systems across development, testing, and maintenance.
arXiv Detail & Related papers (2025-03-09T20:02:04Z)
- A Survey of Model Extraction Attacks and Defenses in Distributed Computing Environments [55.60375624503877]
Model Extraction Attacks (MEAs) threaten modern machine learning systems by enabling adversaries to steal models, exposing intellectual property and training data. This survey is motivated by the urgent need to understand how the unique characteristics of cloud, edge, and federated deployments shape attack vectors and defense requirements. We systematically examine the evolution of attack methodologies and defense mechanisms across these environments, demonstrating how environmental factors influence security strategies in critical sectors such as autonomous vehicles, healthcare, and financial services.
arXiv Detail & Related papers (2025-02-22T03:46:50Z)
- Robust Image Classification: Defensive Strategies against FGSM and PGD Adversarial Attacks [0.0]
Adversarial attacks pose significant threats to the robustness of deep learning models in image classification.
This paper explores and refines defense mechanisms against these attacks to enhance the resilience of neural networks.
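For reference, the two attacks named in the title are standard and compact. A PyTorch-style sketch (assuming a differentiable classifier `model` and inputs scaled to [0, 1]; names are illustrative) looks like this:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon):
    """Fast Gradient Sign Method: one step in the gradient-sign
    direction, bounded by epsilon in L-infinity."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    return (x_adv + epsilon * grad.sign()).clamp(0, 1).detach()

def pgd_attack(model, x, y, epsilon, alpha=0.01, steps=10):
    """Projected Gradient Descent: iterated FGSM-style steps projected
    back into the epsilon-ball around the original input."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)  # project to eps-ball
            x_adv = x_adv.clamp(0, 1)
        x_adv = x_adv.detach()
    return x_adv
```

Adversarial training, the most common defensive strategy against both attacks, mixes such examples into each training minibatch.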
arXiv Detail & Related papers (2024-08-20T02:00:02Z)
- LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models [75.89014602596673]
Strategic reasoning requires understanding and predicting adversary actions in multi-agent settings while adjusting strategies accordingly.
We explore the scopes, applications, methodologies, and evaluation metrics related to strategic reasoning with Large Language Models.
The survey underscores the importance of strategic reasoning as a critical cognitive capability and offers insights into future research directions and potential improvements.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- Data Poisoning for In-context Learning [49.77204165250528]
In-context learning (ICL) has been recognized for its innovative ability to adapt to new tasks.
This paper delves into the critical issue of ICL's susceptibility to data poisoning attacks.
We introduce ICLPoison, a specialized attacking framework conceived to exploit the learning mechanisms of ICL.
arXiv Detail & Related papers (2024-02-03T14:20:20Z)
- Adversarial Robustness on Image Classification with $k$-means [3.5385056709199536]
We evaluate the vulnerability of $k$-means clustering algorithms to adversarial attacks, emphasising the associated security risks.
We introduce and evaluate an adversarial training method that improves testing performance in adversarial scenarios.
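As a toy illustration of the kind of vulnerability involved (not this paper's exact attack), the sketch below nudges a point toward its second-nearest centroid until its k-means assignment flips, recording the L2 budget spent:

```python
import numpy as np

def flip_assignment(x, centroids, step=0.05, max_iters=200):
    """Perturb x toward its second-nearest centroid until its k-means
    cluster assignment changes; returns the perturbed point and the
    L2 distance moved (illustrative attack only)."""
    orig = np.argmin(np.linalg.norm(centroids - x, axis=1))
    x_adv = x.astype(float).copy()
    for _ in range(max_iters):
        d = np.linalg.norm(centroids - x_adv, axis=1)
        if np.argmin(d) != orig:
            return x_adv, np.linalg.norm(x_adv - x)  # assignment flipped
        target = np.argsort(d)[1]  # second-nearest centroid
        direction = centroids[target] - x_adv
        x_adv += step * direction / (np.linalg.norm(direction) + 1e-12)
    return x_adv, np.linalg.norm(x_adv - x)  # budget exhausted, no flip
```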
arXiv Detail & Related papers (2023-12-15T04:51:43Z)
- Physical Adversarial Attacks For Camera-based Smart Systems: Current Trends, Categorization, Applications, Research Challenges, and Future Outlook [2.1771693754641013]
We aim to provide a thorough understanding of the concept of physical adversarial attacks, analyzing their key characteristics and distinguishing features.
Our article delves into various physical adversarial attack methods, categorized according to their target tasks in different applications.
We assess the performance of these attack methods in terms of their effectiveness, stealthiness, and robustness.
arXiv Detail & Related papers (2023-08-11T15:02:19Z)
- Adversarial Attacks and Defenses in Machine Learning-Powered Networks: A Contemporary Survey [114.17568992164303]
Adversarial attacks and defenses in machine learning and deep neural networks have been gaining significant attention.
This survey provides a comprehensive overview of the recent advancements in the field of adversarial attack and defense techniques.
New avenues of attack are also explored, including search-based, decision-based, drop-based, and physical-world attacks.
arXiv Detail & Related papers (2023-03-11T04:19:31Z)
- Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network. We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
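Conceptually, a learned attack optimizer replaces the fixed sign-of-gradient update of hand-crafted attacks with an RNN-produced update direction. A hedged per-coordinate sketch, with all module names and sizes invented for illustration (this is not the authors' architecture), follows:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttackOptimizerRNN(nn.Module):
    """Per-coordinate GRU mapping the current input gradient to an update
    direction (illustrative stand-in for a learned attack optimizer)."""
    def __init__(self, hidden=32):
        super().__init__()
        self.cell = nn.GRUCell(1, hidden)  # one scalar gradient per coordinate
        self.head = nn.Linear(hidden, 1)

    def forward(self, grad, h):
        g = grad.reshape(-1, 1)            # each coordinate as a batch item
        h = self.cell(g, h)
        return self.head(h).reshape(grad.shape), h

def learned_attack(model, opt_rnn, x, y, epsilon, alpha=0.01, steps=10):
    """PGD-style loop whose update direction comes from the learned optimizer."""
    h = torch.zeros(x.numel(), opt_rnn.cell.hidden_size)
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        update, h = opt_rnn(grad.detach(), h)
        h = h.detach()  # attack-time only: drop the meta-training graph
        with torch.no_grad():
            x_adv = x_adv + alpha * torch.tanh(update)
            x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)  # stay in eps-ball
        x_adv = x_adv.detach()
    return x_adv
```

Meta-training the optimizer (the "model-agnostic training algorithm" in the summary) would instead keep this graph and backpropagate the attack loss into the GRU's parameters across a set of defenses.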
arXiv Detail & Related papers (2021-10-13T13:54:24Z)