Fugu-MT Paper Translation (Abstract): The Escalator Problem: Identifying Implicit Motion Blindness in AI for Accessibility

Paper overview: The Escalator Problem: Identifying Implicit Motion Blindness in AI for Accessibility

arxiv url: http://arxiv.org/abs/2508.07989v1
Date: Mon, 11 Aug 2025 13:53:09 GMT
Status: Translation complete
System update date: 2025-08-12 21:23:29.124226
Title: The Escalator Problem: Identifying Implicit Motion Blindness in AI for Accessibility
Title (reference translation): The Escalator Problem: Identification of Implicit Motion Blindness in AI for Accessibility
Authors: Xiantao Zhang
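Abstract summary: The Escalator Problem is the inability of state-of-the-art models to perceive an escalator's direction of travel. This blindness stems from the dominance of the frame-sampling paradigm in video understanding. The paper advocates a paradigm shift from purely semantic recognition toward robust physical perception.
Reference score (independently computed attention score): 0.9867937058271615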
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multimodal Large Language Models (MLLMs) hold immense promise as assistive technologies for the blind and visually impaired (BVI) community. However, we identify a critical failure mode that undermines their trustworthiness in real-world applications. We introduce the Escalator Problem -- the inability of state-of-the-art models to perceive an escalator's direction of travel -- as a canonical example of a deeper limitation we term Implicit Motion Blindness. This blindness stems from the dominant frame-sampling paradigm in video understanding, which, by treating videos as discrete sequences of static images, fundamentally struggles to perceive continuous, low-signal motion. As a position paper, our contribution is not a new model but rather to: (I) formally articulate this blind spot, (II) analyze its implications for user trust, and (III) issue a call to action. We advocate for a paradigm shift from purely semantic recognition towards robust physical perception and urge the development of new, human-centered benchmarks that prioritize safety, reliability, and the genuine needs of users in dynamic environments.
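Abstract (reference translation): Multimodal Large Language Models (MLLMs) hold great potential as assistive technology for the blind and visually impaired (BVI) community. However, we identify a critical failure mode that undermines their reliability in real-world applications. We introduce the Escalator Problem, the inability of state-of-the-art models to perceive an escalator's direction of travel, as an example of a deeper limitation called Implicit Motion Blindness. This blindness stems from the dominant frame-sampling paradigm in video understanding, which treats videos as discrete sequences of still images and therefore fundamentally struggles to recognize continuous, low-signal motion. As a position paper, our contribution is not a new model but to (I) formally articulate this blind spot, (II) analyze its impact on user trust, and (III) issue a call to action. We advocate a paradigm shift from purely semantic recognition to robust physical perception, and encourage the development of new human-centered benchmarks that prioritize safety, reliability, and the genuine needs of users in dynamic environments.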
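
The abstract attributes the blind spot to frame sampling, that is, treating a video as a sparse sequence of still images. The toy sketch below is not taken from the paper; it illustrates one plausible mechanism, temporal aliasing of a periodic pattern, under assumptions of our own: the "escalator" is a 1-D strip with a step edge every `period` cells, and the helper names `render_frame` and `estimate_shift` are hypothetical.

```python
def render_frame(t, width=24, period=4, speed=1):
    """Render a toy 1-D escalator at time t: '#' marks a step edge.

    Assumption: steps repeat every `period` cells and move `speed` cells
    per time unit (positive = one direction, negative = the other).
    """
    offset = (speed * t) % period
    return "".join("#" if (x - offset) % period == 0 else "." for x in range(width))


def estimate_shift(a, b, period=4):
    """Return the smallest signed circular shift that maps frame a onto frame b.

    Candidates are tried in order of increasing magnitude (0, +1, -1, +2, ...),
    mimicking an observer who assumes the least motion consistent with two stills.
    Returns None if no shift within half a period matches.
    """
    n = len(a)
    for magnitude in range(period // 2 + 1):
        for s in (magnitude, -magnitude):
            k = s % n
            shifted = a[n - k:] + a[:n - k]  # shifted[x] == a[x - s]
            if shifted == b:
                return s
    return None


if __name__ == "__main__":
    first = render_frame(0)
    # Compare the first frame with frames sampled 1, 3, and 4 time units later.
    for stride in (1, 3, 4):
        later = render_frame(stride)
        print(f"stride={stride}: {first} -> {later} "
              f"apparent shift = {estimate_shift(first, later)}")
```

With a stride of 1 the apparent shift matches the true direction. With a stride of 3 the same motion appears reversed, and with a stride of 4 the two frames are identical, so the pattern looks stationary. Real escalator footage is far richer than this strip, so the sketch should be read only as an intuition for why pairs of sparsely sampled stills can underdetermine direction.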

Related papers