Here Comes The AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications
- URL: http://arxiv.org/abs/2403.02817v1
- Date: Tue, 5 Mar 2024 09:37:13 GMT
- Title: Here Comes The AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications
- Authors: Stav Cohen, Ron Bitton, Ben Nassi,
- Abstract summary: Morris II is the first worm designed to target GenAI ecosystems through the use of adversarial self-replicating prompts.
We demonstrate the application of Morris II against GenAIpowered email assistants in two use cases.
- Score: 6.904930679944526
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the past year, numerous companies have incorporated Generative AI (GenAI) capabilities into new and existing applications, forming interconnected Generative AI (GenAI) ecosystems consisting of semi/fully autonomous agents powered by GenAI services. While ongoing research highlighted risks associated with the GenAI layer of agents (e.g., dialog poisoning, membership inference, prompt leaking, jailbreaking), a critical question emerges: Can attackers develop malware to exploit the GenAI component of an agent and launch cyber-attacks on the entire GenAI ecosystem? This paper introduces Morris II, the first worm designed to target GenAI ecosystems through the use of adversarial self-replicating prompts. The study demonstrates that attackers can insert such prompts into inputs that, when processed by GenAI models, prompt the model to replicate the input as output (replication), engaging in malicious activities (payload). Additionally, these inputs compel the agent to deliver them (propagate) to new agents by exploiting the connectivity within the GenAI ecosystem. We demonstrate the application of Morris II against GenAIpowered email assistants in two use cases (spamming and exfiltrating personal data), under two settings (black-box and white-box accesses), using two types of input data (text and images). The worm is tested against three different GenAI models (Gemini Pro, ChatGPT 4.0, and LLaVA), and various factors (e.g., propagation rate, replication, malicious activity) influencing the performance of the worm are evaluated.
Related papers
- Generative Active Adaptation for Drifting and Imbalanced Network Intrusion Detection [15.146203784334086]
We propose a generative active adaptation framework that minimizes labeling effort while enhancing model robustness.
We evaluate our end-to-end framework on both simulated IDS data and a real-world ISP dataset.
Our framework effectively enhances rare attack detection while reducing labeling costs, making it a scalable and adaptive solution for real-world intrusion detection.
arXiv Detail & Related papers (2025-03-04T21:49:42Z) - The Roles of Generative Artificial Intelligence in Internet of Electric Vehicles [65.14115295214636]
We specifically consider Internet of electric vehicles (IoEV) and we categorize GenAI for IoEV into four different layers.
We introduce various GenAI techniques used in each layer of IoEV applications.
Public datasets available for training the GenAI models are summarized.
arXiv Detail & Related papers (2024-09-24T05:12:10Z) - Unleashing Worms and Extracting Data: Escalating the Outcome of Attacks against RAG-based Inference in Scale and Severity Using Jailbreaking [6.904930679944526]
We show that with the ability to jailbreak a GenAI model, attackers can escalate the outcome of attacks against RAG-based applications.
In the first part of the paper, we show that attackers can escalate RAG membership inference attacks to RAG documents extraction attacks.
In the second part of the paper, we show that attackers can escalate the scale of RAG data poisoning attacks from compromising a single application to compromising the entire GenAI ecosystem.
arXiv Detail & Related papers (2024-09-12T13:50:22Z) - A Jailbroken GenAI Model Can Cause Substantial Harm: GenAI-powered Applications are Vulnerable to PromptWares [6.904930679944526]
We show that a jailbroken GenAI model can cause substantial harm to GenAI-powered applications.
We present PromptWare, a new type of attack that flips the GenAI model's behavior from serving an application to attacking it.
arXiv Detail & Related papers (2024-08-09T13:32:50Z) - Generative artificial intelligence in dentistry: Current approaches and future challenges [0.0]
generative AI (GenAI) models bridge the usability gap of AI by providing a natural language interface to interact with complex models.
In dental education, the student now has the opportunity to solve a plethora of questions by only prompting a GenAI model.
GenAI can also be used in dental research, where the applications range from new drug discovery to assistance in academic writing.
arXiv Detail & Related papers (2024-07-24T03:33:47Z) - AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases [73.04652687616286]
We propose AgentPoison, the first backdoor attack targeting generic and RAG-based LLM agents by poisoning their long-term memory or RAG knowledge base.
Unlike conventional backdoor attacks, AgentPoison requires no additional model training or fine-tuning.
On each agent, AgentPoison achieves an average attack success rate higher than 80% with minimal impact on benign performance.
arXiv Detail & Related papers (2024-07-17T17:59:47Z) - BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models [57.5404308854535]
Safety backdoor attacks in large language models (LLMs) enable the stealthy triggering of unsafe behaviors while evading detection during normal interactions.
We present BEEAR, a mitigation approach leveraging the insight that backdoor triggers induce relatively uniform drifts in the model's embedding space.
Our bi-level optimization method identifies universal embedding perturbations that elicit unwanted behaviors and adjusts the model parameters to reinforce safe behaviors against these perturbations.
arXiv Detail & Related papers (2024-06-24T19:29:47Z) - Genetic Auto-prompt Learning for Pre-trained Code Intelligence Language Models [54.58108387797138]
We investigate the effectiveness of prompt learning in code intelligence tasks.
Existing automatic prompt design methods are very limited to code intelligence tasks.
We propose Genetic Auto Prompt (GenAP) which utilizes an elaborate genetic algorithm to automatically design prompts.
arXiv Detail & Related papers (2024-03-20T13:37:00Z) - At the Dawn of Generative AI Era: A Tutorial-cum-Survey on New Frontiers
in 6G Wireless Intelligence [11.847999494242387]
Generative AI (GenAI) pertains to generative models (GMs) capable of discerning the underlying data distribution, patterns, and features of the input data.
This makes GenAI a crucial asset in wireless domain wherein real-world data is often scarce, incomplete, costly to acquire, and hard to model or comprehend.
We outline the central role of GMs in pioneering areas of 6G network research, including semantic/THz/near-field communications, ISAC, extremely large antenna arrays, digital twins, AI-generated content services, mobile edge computing and edge AI, adversarial ML, and trustworthy
arXiv Detail & Related papers (2024-02-02T06:23:25Z) - Prompt Smells: An Omen for Undesirable Generative AI Outputs [4.105236597768038]
We propose two new concepts that will aid the research community in addressing limitations associated with the application of GenAI models.
First, we propose a definition for the "desirability" of GenAI outputs and three factors which are observed to influence it.
Second, drawing inspiration from Martin Fowler's code smells, we propose the concept of "prompt smells" and the adverse effects they are observed to have on the desirability of GenAI outputs.
arXiv Detail & Related papers (2024-01-23T10:10:01Z) - Efficient Trigger Word Insertion [9.257916713112945]
Our main objective is to reduce the number of poisoned samples while still achieving a satisfactory Attack Success Rate (ASR) in text backdoor attacks.
We propose an efficient trigger word insertion strategy in terms of trigger word optimization and poisoned sample selection.
Our approach achieves an ASR of over 90% with only 10 poisoned samples in the dirty-label setting and requires merely 1.5% of the training data in the clean-label setting.
arXiv Detail & Related papers (2023-11-23T12:15:56Z) - Malicious Agent Detection for Robust Multi-Agent Collaborative Perception [52.261231738242266]
Multi-agent collaborative (MAC) perception is more vulnerable to adversarial attacks than single-agent perception.
We propose Malicious Agent Detection (MADE), a reactive defense specific to MAC perception.
We conduct comprehensive evaluations on a benchmark 3D dataset V2X-sim and a real-road dataset DAIR-V2X.
arXiv Detail & Related papers (2023-10-18T11:36:42Z) - Identifying and Mitigating the Security Risks of Generative AI [179.2384121957896]
This paper reports the findings of a workshop held at Google on the dual-use dilemma posed by GenAI.
GenAI can be used just as well by attackers to generate new attacks and increase the velocity and efficacy of existing attacks.
We discuss short-term and long-term goals for the community on this topic.
arXiv Detail & Related papers (2023-08-28T18:51:09Z) - Improved Activation Clipping for Universal Backdoor Mitigation and
Test-Time Detection [27.62279831135902]
Deep neural networks are vulnerable toTrojan attacks, where an attacker poisons the training set with backdoor triggers.
Recent work shows that backdoor poisoning induces over-fitting (abnormally large activations) in the attacked model.
We devise a new such approach, choosing the activation bounds to explicitly limit classification margins.
arXiv Detail & Related papers (2023-08-08T22:47:39Z) - BAGM: A Backdoor Attack for Manipulating Text-to-Image Generative Models [54.19289900203071]
The rise in popularity of text-to-image generative artificial intelligence has attracted widespread public interest.
We demonstrate that this technology can be attacked to generate content that subtly manipulates its users.
We propose a Backdoor Attack on text-to-image Generative Models (BAGM)
Our attack is the first to target three popular text-to-image generative models across three stages of the generative process.
arXiv Detail & Related papers (2023-07-31T08:34:24Z) - IMBERT: Making BERT Immune to Insertion-based Backdoor Attacks [45.81957796169348]
Backdoor attacks are an insidious security threat against machine learning models.
We introduce IMBERT, which uses either gradients or self-attention scores derived from victim models to self-defend against backdoor attacks.
Our empirical studies demonstrate that IMBERT can effectively identify up to 98.5% of inserted triggers.
arXiv Detail & Related papers (2023-05-25T22:08:57Z) - GenNI: Human-AI Collaboration for Data-Backed Text Generation [102.08127062293111]
Table2Text systems generate textual output based on structured data utilizing machine learning.
GenNI (Generation Negotiation Interface) is an interactive visual system for high-level human-AI collaboration in producing descriptive text.
arXiv Detail & Related papers (2021-10-19T18:07:07Z) - The Feasibility and Inevitability of Stealth Attacks [63.14766152741211]
We study new adversarial perturbations that enable an attacker to gain control over decisions in generic Artificial Intelligence systems.
In contrast to adversarial data modification, the attack mechanism we consider here involves alterations to the AI system itself.
arXiv Detail & Related papers (2021-06-26T10:50:07Z) - Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning [95.60856995067083]
This work is among the first to perform adversarial defense for ASV without knowing the specific attack algorithms.
We propose to perform adversarial defense from two perspectives: 1) adversarial perturbation purification and 2) adversarial perturbation detection.
Experimental results show that our detection module effectively shields the ASV by detecting adversarial samples with an accuracy of around 80%.
arXiv Detail & Related papers (2021-06-01T07:10:54Z) - Combating Adversaries with Anti-Adversaries [118.70141983415445]
In particular, our layer generates an input perturbation in the opposite direction of the adversarial one.
We verify the effectiveness of our approach by combining our layer with both nominally and robustly trained models.
Our anti-adversary layer significantly enhances model robustness while coming at no cost on clean accuracy.
arXiv Detail & Related papers (2021-03-26T09:36:59Z) - Adversarial vs behavioural-based defensive AI with joint, continual and
active learning: automated evaluation of robustness to deception, poisoning
and concept drift [62.997667081978825]
Recent advancements in Artificial Intelligence (AI) have brought new capabilities to behavioural analysis (UEBA) for cyber-security.
In this paper, we present a solution to effectively mitigate this attack by improving the detection process and efficiently leveraging human expertise.
arXiv Detail & Related papers (2020-01-13T13:54:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.