Fugu-MT Paper Translation (Abstract): Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms

Paper summary: Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms

arxiv url: http://arxiv.org/abs/2604.23775v1
Date: Sun, 26 Apr 2026 15:58:19 GMT
Status: Translation complete
Last updated in system: 2026-04-28 17:12:07.541443
Title: Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms
Title (reference translation): Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms
Authors: Qi Li, Bo Yin, Weiqi Huang, Ruhao Liu, Bojun Zou, Runpeng Yu, Jingwen Ye, Weihao Yu, Xinchao Wang
Abstract summary: Vision-Language-Action (VLA) models are emerging as a unified substrate for embodied intelligence. This survey provides a unified and up-to-date overview of safety in Vision-Language-Action models.
Reference score (independently computed attention score): 61.29786895012393
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Vision-Language-Action (VLA) models are emerging as a unified substrate for embodied intelligence. This shift raises a new class of safety challenges, stemming from the embodied nature of VLA systems, including irreversible physical consequences, a multimodal attack surface across vision, language, and state, real-time latency constraints on defense, error propagation over long-horizon trajectories, and vulnerabilities in the data supply chain. Yet the literature remains fragmented across robotic learning, adversarial machine learning, AI alignment, and autonomous systems safety. This survey provides a unified and up-to-date overview of safety in Vision-Language-Action models. We organize the field along two parallel timing axes, attack timing (training-time vs. inference-time) and defense timing (training-time vs. inference-time), linking each class of threat to the stage at which it can be mitigated. We first define the scope of VLA safety, distinguishing it from text-only LLM safety and classical robotic safety, and review the foundations of VLA models, including architectures, training paradigms, and inference mechanisms. We then examine the literature through four lenses: Attacks, Defenses, Evaluation, and Deployment. We survey training-time threats such as data poisoning and backdoors, as well as inference-time attacks including adversarial patches, cross-modal perturbations, semantic jailbreaks, and freezing attacks. We review training-time and runtime defenses, analyze existing benchmarks and metrics, and discuss safety challenges across six deployment domains. Finally, we highlight key open problems, including certified robustness for embodied trajectories, physically realizable defenses, safety-aware training, unified runtime safety architectures, and standardized evaluation.
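Abstract (reference translation): Vision-Language-Action (VLA) models are emerging as a unified foundation for embodied intelligence. This shift raises a new class of safety challenges rooted in the embodied nature of VLA systems: irreversible physical consequences, a multimodal attack surface spanning vision, language, and state, real-time latency constraints on defenses, error propagation over long-horizon trajectories, and vulnerabilities in the data supply chain. However, the literature is fragmented across robot learning, adversarial machine learning, AI alignment, and autonomous systems safety. This survey provides a unified and up-to-date overview of safety in Vision-Language-Action models. It organizes the field along two parallel timing axes, attack timing (training-time vs. inference-time) and defense timing (training-time vs. inference-time), and links each class of threat to the stage at which it can be mitigated. It first defines the scope of VLA safety, distinguishing it from text-only LLM safety and classical robotic safety, and reviews the foundations of VLA models, including architectures, training paradigms, and inference mechanisms. It then examines the literature through four lenses: attacks, defenses, evaluation, and deployment. It surveys training-time threats such as data poisoning and backdoors, as well as inference-time attacks such as adversarial patches, cross-modal perturbations, semantic jailbreaks, and freezing attacks. It reviews training-time and runtime defenses, analyzes existing benchmarks and metrics, and discusses safety challenges across six deployment domains. Finally, it highlights key open problems, including certified robustness for embodied trajectories, physically realizable defenses, safety-aware training, unified runtime safety architectures, and standardized evaluation.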

Related papers