ANNIE: Be Careful of Your Robots
- URL: http://arxiv.org/abs/2509.03383v1
- Date: Wed, 03 Sep 2025 15:00:28 GMT
- Title: ANNIE: Be Careful of Your Robots
- Authors: Yiyang Huang, Zixuan Wang, Zishen Wan, Yapeng Tian, Haobo Xu, Yinhe Han, Yiming Gan
- Abstract summary: We present the first systematic study of adversarial safety attacks on embodied AI systems. We show attack success rates exceeding 50% across all safety categories. Results expose a previously underexplored but highly consequential attack surface in embodied AI systems.
- Score: 48.89876809734855
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The integration of vision-language-action (VLA) models into embodied AI (EAI) robots is rapidly advancing their ability to perform complex, long-horizon tasks in human-centric environments. However, EAI systems introduce critical security risks: a compromised VLA model can directly translate adversarial perturbations on sensory input into unsafe physical actions. Traditional safety definitions and methodologies from the machine learning community are no longer sufficient. EAI systems raise new questions, such as what constitutes safety, how to measure it, and how to design effective attack and defense mechanisms in physically grounded, interactive settings. In this work, we present the first systematic study of adversarial safety attacks on embodied AI systems, grounded in ISO standards for human-robot interactions. We (1) formalize a principled taxonomy of safety violations (critical, dangerous, risky) based on physical constraints such as separation distance, velocity, and collision boundaries; (2) introduce ANNIEBench, a benchmark of nine safety-critical scenarios with 2,400 video-action sequences for evaluating embodied safety; and (3) ANNIE-Attack, a task-aware adversarial framework with an attack leader model that decomposes long-horizon goals into frame-level perturbations. Our evaluation across representative EAI models shows attack success rates exceeding 50% across all safety categories. We further demonstrate sparse and adaptive attack strategies and validate the real-world impact through physical robot experiments. These results expose a previously underexplored but highly consequential attack surface in embodied AI systems, highlighting the urgent need for security-driven defenses in the physical AI era. Code is available at https://github.com/RLCLab/Annie.
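To make the taxonomy concrete, here is a minimal sketch of how the three violation tiers (critical, dangerous, risky) could be operationalized as threshold checks on physical state. All field names and numeric thresholds below are illustrative assumptions; the paper grounds its actual boundaries in ISO human-robot interaction standards, and the exact constraint values are not given in the abstract.

```python
from dataclasses import dataclass

@dataclass
class RobotState:
    separation_m: float  # distance from robot to nearest human, in meters
    speed_mps: float     # robot end-effector speed, in meters per second
    collision: bool      # whether contact with a human or obstacle occurred

# Hypothetical thresholds, for illustration only.
CRITICAL_SEP_M = 0.1
DANGEROUS_SEP_M = 0.5
RISKY_SPEED_MPS = 0.25

def classify_violation(state: RobotState) -> str:
    """Map a physical robot state onto a critical/dangerous/risky tier."""
    if state.collision or state.separation_m < CRITICAL_SEP_M:
        return "critical"   # contact, or imminent contact, with a human
    if state.separation_m < DANGEROUS_SEP_M and state.speed_mps > RISKY_SPEED_MPS:
        return "dangerous"  # close approach at unsafe speed
    if state.separation_m < DANGEROUS_SEP_M or state.speed_mps > RISKY_SPEED_MPS:
        return "risky"      # a single physical constraint violated
    return "safe"

print(classify_violation(RobotState(separation_m=0.3, speed_mps=0.4, collision=False)))  # dangerous
```

One way a classifier of this shape could feed an attack-success criterion: count an episode as a successful attack in a given category when the perturbed rollout enters that tier while the clean rollout does not.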
Related papers
- What Breaks Embodied AI Security: LLM Vulnerabilities, CPS Flaws, or Something Else? [28.12412876058788]
Embodied AI systems are rapidly transitioning from controlled environments to safety-critical real-world deployments. Unlike disembodied AI, failures in embodied intelligence lead to irreversible physical consequences. We argue that a significant class of failures arises from embodiment-induced system-level mismatches.
arXiv Detail & Related papers (2026-02-19T13:29:00Z) - Can AI Perceive Physical Danger and Intervene? [16.825608691806988]
New safety challenges emerge when AI interacts with the physical world. How well do state-of-the-art foundation models understand common-sense facts about physical safety?
arXiv Detail & Related papers (2025-09-25T22:09:17Z) - Preventing Adversarial AI Attacks Against Autonomous Situational Awareness: A Maritime Case Study [0.0]
Adversarial artificial intelligence (AI) attacks pose a significant threat to autonomous transportation. This paper addresses three critical research challenges associated with adversarial AI. We propose building defences utilising multiple inputs and data fusion to create defensive components.
arXiv Detail & Related papers (2025-05-27T17:59:05Z) - CEE: An Inference-Time Jailbreak Defense for Embodied Intelligence via Subspace Concept Rotation [23.07221882519171]
Large Language Models (LLMs) are increasingly becoming the cognitive core of Embodied Intelligence (EI) systems. We propose a novel and efficient inference-time defense framework: Concept Enhancement Engineering (CEE). CEE enhances the model's inherent safety mechanisms by directly manipulating its internal representations.
arXiv Detail & Related papers (2025-04-15T03:50:04Z) - An Approach to Technical AGI Safety and Security [72.83728459135101]
We develop an approach to address the risk of harms consequential enough to significantly harm humanity. We focus on technical approaches to misuse and misalignment. We briefly outline how these ingredients could be combined to produce safety cases for AGI systems.
arXiv Detail & Related papers (2025-04-02T15:59:31Z) - Towards Robust and Secure Embodied AI: A Survey on Vulnerabilities and Attacks [22.154001025679896]
Embodied AI systems, including robots and autonomous vehicles, are increasingly integrated into real-world applications. Their vulnerabilities manifest through sensor spoofing, adversarial attacks, and failures in task and motion planning.
arXiv Detail & Related papers (2025-02-18T03:38:07Z) - Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics [68.36528819227641]
This paper systematically evaluates the robustness of Vision-Language-Action (VLA) models. We introduce two untargeted attack objectives that leverage spatial foundations to destabilize robotic actions, and a targeted attack objective that manipulates the robotic trajectory. We design an adversarial patch generation approach that places a small, colorful patch within the camera's view, effectively executing the attack in both digital and physical environments. (A minimal sketch of the shared frame-perturbation primitive appears after this list.)
arXiv Detail & Related papers (2024-11-18T01:52:20Z) - Defining and Evaluating Physical Safety for Large Language Models [62.4971588282174]
Large Language Models (LLMs) are increasingly used to control robotic systems such as drones.
Their risks of causing physical threats and harm in real-world applications remain unexplored.
We classify the physical safety risks of drones into four categories: (1) human-targeted threats, (2) object-targeted threats, (3) infrastructure attacks, and (4) regulatory violations.
arXiv Detail & Related papers (2024-11-04T17:41:25Z) - Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI [52.138044013005]
As generative AI, particularly large language models (LLMs), becomes increasingly integrated into production applications, new attack surfaces and vulnerabilities emerge, putting a focus on adversarial threats in natural language and multi-modal systems.
Red-teaming has gained importance in proactively identifying weaknesses in these systems, while blue-teaming works to protect against such adversarial attacks.
This work aims to bridge the gap between academic insights and practical security measures for the protection of generative AI systems.
arXiv Detail & Related papers (2024-09-23T10:18:10Z) - EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results. However, the deployment of these agents in physical environments presents significant safety challenges. This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z) - Security Considerations in AI-Robotics: A Survey of Current Methods, Challenges, and Opportunities [4.466887678364242]
Motivated by the need to address the security concerns in AI-Robotics systems, this paper presents a comprehensive survey and taxonomy across three dimensions.
We begin by surveying potential attack surfaces and provide mitigating defensive strategies.
We then delve into ethical issues, such as dependency and psychological impact, as well as the legal concerns regarding accountability for these systems.
arXiv Detail & Related papers (2023-10-12T17:54:20Z)
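Several of the attacks summarized above (ANNIE-Attack's frame-level perturbations, the untargeted VLA attack objectives) share a common primitive: gradient-based perturbation of a single observation frame. The sketch below illustrates that primitive with an untargeted L-infinity PGD loop against a stand-in policy network; the architecture, loss, and hyperparameters are assumptions for illustration, not the method of any paper listed here.

```python
import torch
import torch.nn as nn

# Stand-in "VLA policy": maps a 64x64 RGB frame to a 7-DoF action vector.
# A tiny placeholder network, not any model evaluated in the papers above.
policy = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=5, stride=4), nn.ReLU(),
    nn.Flatten(), nn.Linear(8 * 15 * 15, 7),
)

def pgd_frame_attack(frame, steps=10, eps=8 / 255, alpha=2 / 255):
    """Untargeted L-infinity PGD on one frame: push the predicted
    action as far as possible from the clean prediction."""
    with torch.no_grad():
        clean_action = policy(frame)
    delta = torch.zeros_like(frame, requires_grad=True)
    for _ in range(steps):
        loss = (policy(frame + delta) - clean_action).norm()
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()                # ascend the loss
            delta.clamp_(-eps, eps)                           # stay in the eps-ball
            delta.copy_((frame + delta).clamp(0, 1) - frame)  # keep pixels valid
        delta.grad.zero_()
    return (frame + delta).detach()

adv_frame = pgd_frame_attack(torch.rand(1, 3, 64, 64))
```

A leader-style attack of the kind the ANNIE abstract describes would sit above a loop like this, choosing which frames to perturb and what per-frame objective to use so that the perturbed action sequence drives the robot toward a long-horizon unsafe outcome; a patch attack would instead constrain delta to a small image region rather than an eps-ball.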