Fugu-MT paper translation (abstract): Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer

Paper summary: Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer

arxiv url: http://arxiv.org/abs/2604.24062v1
Date: Mon, 27 Apr 2026 05:37:53 GMT
Status: Translation complete
System update date: 2026-04-28 17:12:07.75487
Title: Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer
Title (reference translation): How AI Diverges from Humans
Authors: Liangru Xiang, Yuxi Ma, Zhihao Cao, Yixin Zhu, Song-Chun Zhu
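Abstract summary: This study examines whether current state-of-the-art AI models possess human-like mechanisms for abstract causal structure transfer. On tasks requiring sequential discovery of Common Cause (CC) and Common Effect (CE) structures, it shows that models exhibit fundamentally delayed or absent transfer. These results indicate that large-scale statistical learning does not produce the decontextualized causal schemas that underpin human analogical reasoning.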
Reference score (independently computed attention score): 47.76679214811589
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Extracting abstract causal structures and applying them to novel situations is a hallmark of human intelligence. While Large Language Models (LLMs) and Vision Language Models (VLMs) have shown strong performance on a wide range of reasoning tasks, their capacity for interactive causal learning -- inducing latent structures through sequential exploration and transferring them across contexts -- remains uncharacterized. Human learners accomplish such transfer after minimal exposure, whereas classical Reinforcement Learning (RL) agents fail catastrophically. Whether state-of-the-art Artificial Intelligence (AI) models possess human-like mechanisms for abstract causal structure transfer is an open question. Using the OpenLock paradigm requiring sequential discovery of Common Cause (CC) and Common Effect (CE) structures, here we show that models exhibit fundamentally delayed or absent transfer: even successful models require initial environmental-specific mapping -- what we term environmental grounding -- before efficiency gains emerge, whereas humans leverage prior structural knowledge from the very first solution attempt. In the text-only condition, models matched or exceeded human discovery efficiency. In contrast, visual information -- in both the image-only and text-and-image conditions -- overall degraded rather than enhanced performance, revealing a broad reliance on symbolic processing rather than integrated multimodal reasoning. Models further exhibited systematic CC/CE asymmetries absent in humans, suggesting heuristic biases rather than direction-neutral causal abstraction. These findings reveal that large-scale statistical learning does not produce the decontextualized causal schemas underpinning human analogical reasoning, establishing grounding-dependent transfer as a fundamental limitation of current LLMs and VLMs.
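Abstract (reference translation): Extracting abstract causal structures and applying them to novel situations is a hallmark of human intelligence. Large Language Models (LLMs) and Vision Language Models (VLMs) perform strongly on a wide range of reasoning tasks, but their capacity for interactive causal learning, that is, inducing latent structures through sequential exploration and transferring them across contexts, has not been characterized. Classical Reinforcement Learning (RL) agents fail catastrophically at such transfer. Whether state-of-the-art Artificial Intelligence (AI) models possess human-like mechanisms for abstract causal structure transfer is an open question. Using the OpenLock paradigm, which requires sequential discovery of Common Cause (CC) and Common Effect (CE) structures, the study shows that models exhibit fundamentally delayed or absent transfer. In the text-only condition, models matched or exceeded human discovery efficiency. In contrast, visual information, in both the image-only and text-and-image conditions, overall degraded rather than enhanced performance, revealing a broad reliance on symbolic processing rather than integrated multimodal reasoning. Models also exhibited systematic CC/CE asymmetries absent in humans, suggesting heuristic biases rather than direction-neutral causal abstraction. These results indicate that large-scale statistical learning does not produce the decontextualized causal schemas that underpin human analogical reasoning, and they establish grounding-dependent transfer as a fundamental limitation of current LLMs and VLMs.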

Related papers