TrojanRobot: Backdoor Attacks Against Robotic Manipulation in the Physical World
- URL: http://arxiv.org/abs/2411.11683v1
- Date: Mon, 18 Nov 2024 16:09:26 GMT
- Title: TrojanRobot: Backdoor Attacks Against Robotic Manipulation in the Physical World
- Authors: Xianlong Wang, Hewen Pan, Hangtao Zhang, Minghui Li, Shengshan Hu, Ziqi Zhou, Lulu Xue, Peijin Guo, Yichen Wang, Wei Wan, Aishan Liu, Leo Yu Zhang,
- Abstract summary: We propose a backdoor attack specifically targeting robotic manipulation and, for the first time, implementing backdoor attack in the physical world.
By embedding a backdoor visual language model into the visual perception module within the robotic system, we successfully mislead the robotic arm's operation in the physical world.
- Score: 22.313765935846046
- License:
- Abstract: Robotic manipulation refers to the autonomous handling and interaction of robots with objects using advanced techniques in robotics and artificial intelligence. The advent of powerful tools such as large language models (LLMs) and large vision-language models (LVLMs) has significantly enhanced the capabilities of these robots in environmental perception and decision-making. However, the introduction of these intelligent agents has led to security threats such as jailbreak attacks and adversarial attacks. In this research, we take a further step by proposing a backdoor attack specifically targeting robotic manipulation and, for the first time, implementing backdoor attack in the physical world. By embedding a backdoor visual language model into the visual perception module within the robotic system, we successfully mislead the robotic arm's operation in the physical world, given the presence of common items as triggers. Experimental evaluations in the physical world demonstrate the effectiveness of the proposed backdoor attack.
Related papers
- Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics [70.93622520400385]
This paper systematically quantifies the robustness of VLA-based robotic systems.
We introduce an untargeted position-aware attack objective that leverages spatial foundations to destabilize robotic actions.
We also design an adversarial patch generation approach that places a small, colorful patch within the camera's view, effectively executing the attack in both digital and physical environments.
arXiv Detail & Related papers (2024-11-18T01:52:20Z) - $π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge.
We evaluate our model in terms of its ability to perform tasks in zero shot after pre-training, follow language instructions from people, and its ability to acquire new skills via fine-tuning.
arXiv Detail & Related papers (2024-10-31T17:22:30Z) - Jailbreaking LLM-Controlled Robots [82.04590367171932]
Large language models (LLMs) have revolutionized the field of robotics by enabling contextual reasoning and intuitive human-robot interaction.
LLMs are vulnerable to jailbreaking attacks, wherein malicious prompters elicit harmful text by bypassing LLM safety guardrails.
We introduce RoboPAIR, the first algorithm designed to jailbreak LLM-controlled robots.
arXiv Detail & Related papers (2024-10-17T15:55:36Z) - Unifying 3D Representation and Control of Diverse Robots with a Single Camera [48.279199537720714]
We introduce Neural Jacobian Fields, an architecture that autonomously learns to model and control robots from vision alone.
Our approach achieves accurate closed-loop control and recovers the causal dynamic structure of each robot.
arXiv Detail & Related papers (2024-07-11T17:55:49Z) - WALL-E: Embodied Robotic WAiter Load Lifting with Large Language Model [92.90127398282209]
This paper investigates the potential of integrating the most recent Large Language Models (LLMs) and existing visual grounding and robotic grasping system.
We introduce the WALL-E (Embodied Robotic WAiter load lifting with Large Language model) as an example of this integration.
We deploy this LLM-empowered system on the physical robot to provide a more user-friendly interface for the instruction-guided grasping task.
arXiv Detail & Related papers (2023-08-30T11:35:21Z) - Giving Robots a Hand: Learning Generalizable Manipulation with
Eye-in-Hand Human Video Demonstrations [66.47064743686953]
Eye-in-hand cameras have shown promise in enabling greater sample efficiency and generalization in vision-based robotic manipulation.
Videos of humans performing tasks, on the other hand, are much cheaper to collect since they eliminate the need for expertise in robotic teleoperation.
In this work, we augment narrow robotic imitation datasets with broad unlabeled human video demonstrations to greatly enhance the generalization of eye-in-hand visuomotor policies.
arXiv Detail & Related papers (2023-07-12T07:04:53Z) - Open-World Object Manipulation using Pre-trained Vision-Language Models [72.87306011500084]
For robots to follow instructions from people, they must be able to connect the rich semantic information in human vocabulary.
We develop a simple approach, which leverages a pre-trained vision-language model to extract object-identifying information.
In a variety of experiments on a real mobile manipulator, we find that MOO generalizes zero-shot to a wide range of novel object categories and environments.
arXiv Detail & Related papers (2023-03-02T01:55:10Z) - RoboMal: Malware Detection for Robot Network Systems [4.357338639836869]
We propose the RoboMal framework of static malware detection on binary executables to detect malware before it gets a chance to execute.
The framework is compared against widely used supervised learning models: GRU, CNN, and ANN.
Notably, the LSTM-based RoboMal model outperforms the other models with an accuracy of 85% and precision of 87% in 10-fold cross-validation.
arXiv Detail & Related papers (2022-01-20T22:11:38Z) - Fault-Aware Robust Control via Adversarial Reinforcement Learning [35.16413579212691]
We propose an adversarial reinforcement learning framework, which significantly increases robot fragility over joint damage cases.
We validate our algorithm on a three-fingered robot hand and a quadruped robot.
Our algorithm can be trained only in simulation and directly deployed on a real robot without any fine-tuning.
arXiv Detail & Related papers (2020-11-17T16:01:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.