Fugu-MT Paper Translation (Abstract): Neuro-Symbolic Manipulation Understanding with Enriched Semantic Event Chains

Paper overview: Neuro-Symbolic Manipulation Understanding with Enriched Semantic Event Chains

arxiv url: http://arxiv.org/abs/2604.21053v1
Date: Wed, 22 Apr 2026 19:53:48 GMT
Status: Translation complete
Last updated in system: 2026-04-24 14:40:06.162087
Title: Neuro-Symbolic Manipulation Understanding with Enriched Semantic Event Chains
Authors: Fatemeh Ziaeetabar
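Abstract summary: We propose eSEC-LAM, a neuro-symbolic framework that transforms eSECs into an explicit event-level symbolic state for manipulation understanding. The framework is evaluated on EPIC-KITCHENS-100, EPIC-KITCHENS VISOR, and Assembly101 across action recognition, next-primitive prediction, robustness to perception noise, and explanation consistency.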
Reference score (independently computed attention measure): 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Robotic systems operating in human environments must reason about how object interactions evolve over time, which actions are currently being performed, and what manipulation step is likely to follow. Classical enriched Semantic Event Chains (eSECs) provide an interpretable relational description of manipulation, but remain primarily descriptive and do not directly support uncertainty-aware decision making. In this paper, we propose eSEC-LAM, a neuro-symbolic framework that transforms eSECs into an explicit event-level symbolic state for manipulation understanding. The proposed formulation augments classical eSECs with confidence-aware predicates, functional object roles, affordance priors, primitive-level abstraction, and saliency-guided explanation cues. These enriched symbolic states are derived from a foundation-model-based perception front-end through deterministic predicate extraction, while current-action inference and next-primitive prediction are performed using lightweight symbolic reasoning over primitive pre- and post-conditions. We evaluate the proposed framework on EPIC-KITCHENS-100, EPIC-KITCHENS VISOR, and Assembly101 across action recognition, next-primitive prediction, robustness to perception noise, and explanation consistency. Experimental results show that eSEC-LAM achieves competitive action recognition, substantially improves next-primitive prediction, remains more robust under degraded perceptual conditions than both classical symbolic and end-to-end video baselines, and provides temporally consistent explanation traces grounded in explicit relational evidence. These findings demonstrate that enriched Semantic Event Chains can serve not only as interpretable descriptors of manipulation, but also as effective internal states for neuro-symbolic action reasoning.
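The abstract describes next-primitive prediction as lightweight symbolic reasoning over primitive pre- and post-conditions applied to confidence-aware predicates. The following is a minimal illustrative sketch of that general idea, not the authors' implementation: the predicate strings, the three primitives, the min-confidence scoring rule, and the 0.5 threshold are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primitive:
    """A manipulation primitive with symbolic pre- and post-conditions."""
    name: str
    pre: frozenset   # predicates that should hold before the primitive
    post: frozenset  # predicates expected to hold after the primitive


# Hypothetical primitive library; names and predicates are illustrative only.
PRIMITIVES = [
    Primitive("reach", frozenset({"hand_free"}),
              frozenset({"touching(hand,cup)"})),
    Primitive("grasp", frozenset({"touching(hand,cup)"}),
              frozenset({"holding(hand,cup)"})),
    Primitive("lift", frozenset({"holding(hand,cup)"}),
              frozenset({"above(cup,table)"})),
]


def precondition_score(state, primitive):
    """Support for a primitive: the weakest confidence among its preconditions.

    `state` maps predicate strings to confidences in [0, 1]; a predicate
    that is absent from the state counts as confidence 0.0.
    """
    return min((state.get(p, 0.0) for p in primitive.pre), default=1.0)


def predict_next(state, primitives=PRIMITIVES, threshold=0.5):
    """Pick the best-supported primitive whose effects are not yet observed.

    Returns the primitive name, or None if no primitive is applicable.
    """
    candidates = []
    for prim in primitives:
        score = precondition_score(state, prim)
        # Skip primitives whose post-conditions already hold in the state.
        already_done = all(state.get(p, 0.0) >= threshold for p in prim.post)
        if score >= threshold and not already_done:
            candidates.append((score, prim.name))
    if not candidates:
        return None
    return max(candidates)[1]


if __name__ == "__main__":
    # Confidence-aware symbolic state, e.g. as produced by a perception front-end.
    state = {"hand_free": 0.2, "touching(hand,cup)": 0.9, "holding(hand,cup)": 0.1}
    print(predict_next(state))
```

Taking the minimum over precondition confidences is one simple way to let a single uncertain predicate veto a primitive; a product or a learned weighting would be equally plausible under the same assumptions.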

Related papers