Fugu-MT Paper Translation (Abstract): RoboView-Bias: Benchmarking Visual Bias in Embodied Agents for Robotic Manipulation

Paper overview: RoboView-Bias: Benchmarking Visual Bias in Embodied Agents for Robotic Manipulation

arxiv url: http://arxiv.org/abs/2509.22356v1
Date: Fri, 26 Sep 2025 13:53:25 GMT
Status: Translation complete
System update date: 2025-09-29 20:57:54.470084
Title: RoboView-Bias: Benchmarking Visual Bias in Embodied Agents for Robotic Manipulation
Title (reference translation): RoboView-Bias: Benchmarking Visual Bias in Embodied Agents for Robotic Manipulation
Authors: Enguang Liu, Siyuan Liang, Liming Lu, Xiyu Zeng, Xiaochun Cao, Aishan Liu, Shuchao Pang
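Abstract summary: We propose RoboView-Bias, the first benchmark designed to quantify visual bias in robotic manipulation. We create 2,127 task instances that enable robust measurement of biases induced by individual visual factors and their interactions. Our results show that systematic analysis of visual bias is a prerequisite for developing safe and reliable general-purpose embodied agents.
Reference score (independently computed attention score): 67.38036090822982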
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The safety and reliability of embodied agents rely on accurate and unbiased visual perception. However, existing benchmarks mainly emphasize generalization and robustness under perturbations, while systematic quantification of visual bias remains scarce. This gap limits a deeper understanding of how perception influences decision-making stability. To address this issue, we propose RoboView-Bias, the first benchmark specifically designed to systematically quantify visual bias in robotic manipulation, following a principle of factor isolation. Leveraging a structured variant-generation framework and a perceptual-fairness validation protocol, we create 2,127 task instances that enable robust measurement of biases induced by individual visual factors and their interactions. Using this benchmark, we systematically evaluate three representative embodied agents across two prevailing paradigms and report three key findings: (i) all agents exhibit significant visual biases, with camera viewpoint being the most critical factor; (ii) agents achieve their highest success rates on highly saturated colors, indicating inherited visual preferences from underlying VLMs; and (iii) visual biases show strong, asymmetric coupling, with viewpoint strongly amplifying color-related bias. Finally, we demonstrate that a mitigation strategy based on a semantic grounding layer substantially reduces visual bias by approximately 54.5% on MOKA. Our results highlight that systematic analysis of visual bias is a prerequisite for developing safe and reliable general-purpose embodied agents.
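Abstract (reference translation): The safety and reliability of embodied agents depend on accurate and unbiased visual perception. However, existing benchmarks mainly emphasize generalization and robustness under perturbations, while systematic quantification of visual bias remains scarce. This gap limits a deeper understanding of how perception affects decision-making stability. To address this issue, we propose RoboView-Bias, the first benchmark for systematically quantifying visual bias in robotic manipulation. Leveraging a structured variant-generation framework and a perceptual-fairness validation protocol, we create 2,127 task instances that enable robust measurement of biases induced by individual visual factors and their interactions. Using this benchmark, we systematically evaluate three representative embodied agents across two prevailing paradigms and report three key findings: (i) all agents exhibit significant visual biases, with camera viewpoint being the most critical factor; (ii) agents achieve their highest success rates on highly saturated colors, indicating visual preferences inherited from the underlying VLMs; and (iii) visual biases show strong, asymmetric coupling, with viewpoint strongly amplifying color-related bias. Finally, we show that a mitigation strategy based on a semantic grounding layer reduces visual bias by approximately 54.5% on MOKA. Our results show that systematic analysis of visual bias is a prerequisite for developing safe and reliable general-purpose embodied agents.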

Related papers