SoK: A Systems Perspective on Compound AI Threats and Countermeasures
- URL: http://arxiv.org/abs/2411.13459v1
- Date: Wed, 20 Nov 2024 17:08:38 GMT
- Title: SoK: A Systems Perspective on Compound AI Threats and Countermeasures
- Authors: Sarbartha Banerjee, Prateek Sahu, Mulong Luo, Anjo Vahldiek-Oberwagner, Neeraja J. Yadwadkar, Mohit Tiwari,
- Abstract summary: We discuss different software and hardware attacks applicable to compound AI systems.
We show how combining multiple attack mechanisms can reduce the threat model assumptions required for an isolated attack.
- Score: 3.458371054070399
- License:
- Abstract: Large language models (LLMs) used across enterprises often use proprietary models and operate on sensitive inputs and data. The wide range of attack vectors identified in prior research - targeting various software and hardware components used in training and inference - makes it extremely challenging to enforce confidentiality and integrity policies. As we advance towards constructing compound AI inference pipelines that integrate multiple large language models (LLMs), the attack surfaces expand significantly. Attackers now focus on the AI algorithms as well as the software and hardware components associated with these systems. While current research often examines these elements in isolation, we find that combining cross-layer attack observations can enable powerful end-to-end attacks with minimal assumptions about the threat model. Given, the sheer number of existing attacks at each layer, we need a holistic and systemized understanding of different attack vectors at each layer. This SoK discusses different software and hardware attacks applicable to compound AI systems and demonstrates how combining multiple attack mechanisms can reduce the threat model assumptions required for an isolated attack. Next, we systematize the ML attacks in lines with the Mitre Att&ck framework to better position each attack based on the threat model. Finally, we outline the existing countermeasures for both software and hardware layers and discuss the necessity of a comprehensive defense strategy to enable the secure and high-performance deployment of compound AI systems.
Related papers
- A Unified Hardware-based Threat Detector for AI Accelerators [12.96840649714218]
We design UniGuard, a novel unified and non-intrusive detection methodology to safeguard FPGA-based AI accelerators.
We employ a Time-to-Digital Converter to capture power fluctuations and train a supervised machine learning model to identify various types of threats.
arXiv Detail & Related papers (2023-11-28T10:55:02Z) - Baseline Defenses for Adversarial Attacks Against Aligned Language
Models [109.75753454188705]
Recent work shows that text moderations can produce jailbreaking prompts that bypass defenses.
We look at three types of defenses: detection (perplexity based), input preprocessing (paraphrase and retokenization), and adversarial training.
We find that the weakness of existing discretes for text, combined with the relatively high costs of optimization, makes standard adaptive attacks more challenging for LLMs.
arXiv Detail & Related papers (2023-09-01T17:59:44Z) - A Human-in-the-Middle Attack against Object Detection Systems [4.764637544913963]
We propose a novel hardware attack inspired by Man-in-the-Middle attacks in cryptography.
This attack generates a Universal Adversarial Perturbations (UAP) and injects the perturbation between the USB camera and the detection system.
These findings raise serious concerns for applications of deep learning models in safety-critical systems, such as autonomous driving.
arXiv Detail & Related papers (2022-08-15T13:21:41Z) - Towards Automated Classification of Attackers' TTPs by combining NLP
with ML Techniques [77.34726150561087]
We evaluate and compare different Natural Language Processing (NLP) and machine learning techniques used for security information extraction in research.
Based on our investigations we propose a data processing pipeline that automatically classifies unstructured text according to attackers' tactics and techniques.
arXiv Detail & Related papers (2022-07-18T09:59:21Z) - Attacks, Defenses, And Tools: A Framework To Facilitate Robust AI/ML
Systems [2.5137859989323528]
Software systems are increasingly relying on Artificial Intelligence (AI) and Machine Learning (ML) components.
This paper presents a framework to characterize attacks and weaknesses associated with AI-enabled systems.
arXiv Detail & Related papers (2022-02-18T22:54:04Z) - Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the
Age of AI-NIDS [70.60975663021952]
We study blackbox adversarial attacks on network classifiers.
We argue that attacker-defender fixed points are themselves general-sum games with complex phase transitions.
We show that a continual learning approach is required to study attacker-defender dynamics.
arXiv Detail & Related papers (2021-11-23T23:42:16Z) - The Feasibility and Inevitability of Stealth Attacks [63.14766152741211]
We study new adversarial perturbations that enable an attacker to gain control over decisions in generic Artificial Intelligence systems.
In contrast to adversarial data modification, the attack mechanism we consider here involves alterations to the AI system itself.
arXiv Detail & Related papers (2021-06-26T10:50:07Z) - Automating Cyber Threat Hunting Using NLP, Automated Query Generation,
and Genetic Perturbation [8.669461942767098]
We have developed the WILEE system that cyber threat hunting by translating high-level threat descriptions into many possible concrete implementations.
Both the (high-level) abstract and (low-level) concrete implementations are represented using a custom domain specific language.
WILEE uses the implementations along with other logic, also written in the DSL, to automatically generate queries to confirm (or refute) any hypotheses tied to the potential adversarial.
arXiv Detail & Related papers (2021-04-23T13:19:12Z) - Adversarial Attack Attribution: Discovering Attributable Signals in
Adversarial ML Attacks [0.7883722807601676]
Even production systems, such as self-driving cars and ML-as-a-service offerings, are susceptible to adversarial inputs.
Can perturbed inputs be attributed to the methods used to generate the attack?
We introduce the concept of adversarial attack attribution and create a simple supervised learning experimental framework to examine the feasibility of discovering attributable signals in adversarial attacks.
arXiv Detail & Related papers (2021-01-08T08:16:41Z) - Composite Adversarial Attacks [57.293211764569996]
Adversarial attack is a technique for deceiving Machine Learning (ML) models.
In this paper, a new procedure called Composite Adrial Attack (CAA) is proposed for automatically searching the best combination of attack algorithms.
CAA beats 10 top attackers on 11 diverse defenses with less elapsed time.
arXiv Detail & Related papers (2020-12-10T03:21:16Z) - On Adversarial Examples and Stealth Attacks in Artificial Intelligence
Systems [62.997667081978825]
We present a formal framework for assessing and analyzing two classes of malevolent action towards generic Artificial Intelligence (AI) systems.
The first class involves adversarial examples and concerns the introduction of small perturbations of the input data that cause misclassification.
The second class, introduced here for the first time and named stealth attacks, involves small perturbations to the AI system itself.
arXiv Detail & Related papers (2020-04-09T10:56:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.