Fugu-MT Paper Translation (Abstract): HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models

Paper overview: HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models

arxiv url: http://arxiv.org/abs/2604.12447v1
Date: Tue, 14 Apr 2026 08:32:02 GMT
Status: Translation complete
Last updated in system: 2026-04-15 19:11:32.344297
Title: HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models
Title (reference translation): HazardArena: Evaluating Semantic Safety in Vision-Language-Action Models
Authors: Zixing Chen, Yifeng Gao, Li Wang, Yunhan Zhao, Yi Liu, Jiayu Li, Xiang Zheng, Zuxuan Wu, Cong Wang, Xingjun Ma, Yu-Gang Jiang
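Abstract summary: This work introduces HazardArena, a benchmark for evaluating the semantic safety of vision-language-action (VLA) models. VLA models trained exclusively on safe scenarios often fail to behave safely when evaluated in the corresponding unsafe scenarios. The authors propose a training-free Safety Option Layer that constrains action execution using semantic attributes or a vision-language judge.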
Reference score (independently computed attention score): 87.35765363039638
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Vision-Language-Action (VLA) models inherit rich world knowledge from vision-language backbones and acquire executable skills via action demonstrations. However, existing evaluations largely focus on action execution success, leaving action policies loosely coupled with visual-linguistic semantics. This decoupling exposes a systematic vulnerability whereby correct action execution may induce unsafe outcomes under semantic risk. To expose this vulnerability, we introduce HazardArena, a benchmark designed to evaluate semantic safety in VLAs under controlled yet risk-bearing contexts. HazardArena is constructed from safe/unsafe twin scenarios that share matched objects, layouts, and action requirements, differing only in the semantic context that determines whether an action is unsafe. We find that VLA models trained exclusively on safe scenarios often fail to behave safely when evaluated in their corresponding unsafe counterparts. HazardArena includes over 2,000 assets and 40 risk-sensitive tasks spanning 7 real-world risk categories grounded in established robotic safety standards. To mitigate this vulnerability, we propose a training-free Safety Option Layer that constrains action execution using semantic attributes or a vision-language judge, substantially reducing unsafe behaviors with minimal impact on task performance. We hope that HazardArena highlights the need to rethink how semantic safety is evaluated and enforced in VLAs as they scale toward real-world deployment.
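Abstract (reference translation): Vision-Language-Action (VLA) models inherit rich world knowledge from vision-language backbones and acquire executable skills through action demonstrations. However, existing evaluations focus mainly on the success of action execution, leaving action policies loosely coupled with visual-linguistic semantics. This decoupling exposes a systematic vulnerability in which correct action execution can produce unsafe outcomes under semantic risk. To expose this vulnerability, we introduce HazardArena, a benchmark designed to evaluate the semantic safety of VLAs in controlled yet risk-bearing contexts. HazardArena is built from safe/unsafe twin scenarios that share matched objects, layouts, and action requirements and differ only in the semantic context that determines whether an action is unsafe. VLA models trained exclusively on safe scenarios often fail to behave safely when evaluated in the corresponding unsafe scenarios. HazardArena includes over 2,000 assets and 40 risk-sensitive tasks spanning 7 real-world risk categories grounded in established robotic safety standards. To mitigate this vulnerability, we propose a training-free Safety Option Layer that constrains action execution using semantic attributes or a vision-language judge, substantially reducing unsafe behaviors with minimal impact on task performance. We hope that HazardArena highlights the need to rethink how semantic safety is evaluated and enforced in VLAs as they scale toward real-world deployment.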


List of related papers