Fugu-MT Paper Translation (Abstract): MOTOR-Bench: A Real-world Dataset and Multi-agent Framework for Zero-shot Human Mental State Understanding

Paper summary: MOTOR-Bench: A Real-world Dataset and Multi-agent Framework for Zero-shot Human Mental State Understanding

arxiv url: http://arxiv.org/abs/2605.09703v1
Date: Sun, 10 May 2026 18:51:34 GMT
Status: Translation complete
Last updated in system: 2026-05-12 23:28:50.381688
Title: MOTOR-Bench: A Real-world Dataset and Multi-agent Framework for Zero-shot Human Mental State Understanding
Authors: Xiaoyu Yuan, Niklas Heikkala, Tiina Törmänen, Hanna Järvenoja, Guoying Zhao, Haoyu Chen
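Abstract summary: We propose a reasoning multi-agent framework named MOTOR-MAS. It coordinates multiple agents through a structured agent coordination mechanism to infer explicit behaviors, internal cognitions, and psychological emotions. Experimental results show that MOTOR-MAS outperforms the best single-model benchmark by 15.93 points in Macro-F1 across the three labels of behavior, cognition, and emotion, and outperforms the general multi-agent benchmark by 10.2 points in internal cognition prediction.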
Reference score (independently computed attention score): 17.083382686596494
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Understanding human mental states from natural behavior is crucial for intelligent systems in the real world. However, most current research focuses on predicting isolated mental state labels, lacking structured annotations of complex interpersonal interactions. To support structured analysis, we introduce MOTOR-Bench, a carefully-designed benchmark with a real-world dataset MOTOR-dataset, containing 1,440 multimodal video clips in collaborative learning scenarios, reflecting key real-world data challenges including natural class imbalance, visual noise, and domain-specific language. Each sample is labeled by educational experts based on self-regulated learning theory. We further evaluate several state-of-the-art multimodal large language models and multi-agent systems in a zero-shot setting on our MOTOR-Bench. However, their performance on this task remains limited, suggesting that existing methods still struggle with structured reasoning from observable behavior to deeper mental states. To address this challenge, we propose a reasoning multi-agent framework, named MOTOR-MAS. It coordinates multiple agents through a structured agent coordination mechanism to infer explicit behaviors, internal cognitions, and psychological emotions. Experimental results show that our MOTOR-MAS outperforms the best single-model benchmark by 15.93 points in Macro-F1 scores for the three labels of behavior, cognition, and emotion, and outperforms the general multi-agent benchmark by 10.2 points in internal cognition prediction.
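The abstract reports results in Macro-F1 and notes that the dataset has natural class imbalance. As a minimal sketch of why that metric is informative under imbalance, the snippet below computes Macro-F1 as the unweighted mean of per-class F1. The function name, the class names, and the toy labels are hypothetical and not taken from the paper, and the paper's exact evaluation protocol is not specified here.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class_f1 = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs.
        per_class_f1.append(2 * tp / denom if denom else 0.0)
    return sum(per_class_f1) / len(per_class_f1)


# Hypothetical, made-up labels: 8 majority-class samples and 2 minority-class samples.
y_true = ["majority"] * 8 + ["minority"] * 2
# A degenerate predictor that always outputs the majority class.
y_pred = ["majority"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.3f} macro_f1={score:.3f}")
```

In this toy case the always-majority predictor reaches 80% accuracy, but its Macro-F1 is much lower because the minority class contributes an F1 of zero with equal weight.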


Related papers