Fugu-MT Paper Translation (Summary): Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact

Paper summary: Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact

arxiv url: http://arxiv.org/abs/2507.00951v1
Date: Tue, 01 Jul 2025 16:52:25 GMT
Status: Translation complete
Last updated in system: 2025-07-03 14:22:59.740959
Title: Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact
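Title (reference translation): Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and Its Societal Impact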
Authors: Rizwan Qureshi, Ranjan Sapkota, Abbas Shah, Amgad Muneer, Anas Zafar, Ashmal Vayani, Maged Shoman, Abdelrahman B. M. Eldaly, Kai Zhang, Ferhat Sadak, Shaina Raza, Xinqi Fan, Ravid Shwartz-Ziv, Hong Yan, Vinjia Jain, Aman Chadha, Manoj Karkee, Jia Wu, Philip Torr, Seyedali Mirjalili
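Abstract summary: This paper offers a cross-disciplinary synthesis spanning artificial intelligence, cognitive neuroscience, psychology, generative models, and agent-based systems. We analyze the architectural and cognitive foundations of general intelligence, highlighting the role of modular reasoning, persistent memory, and multi-agent coordination. We identify key scientific, technical, and ethical challenges on the path to artificial general intelligence.
Reference score (independently calculated attention score): 31.63205881016299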
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Can machines truly think, reason and act in domains like humans? This enduring question continues to shape the pursuit of Artificial General Intelligence (AGI). Despite the growing capabilities of models such as GPT-4.5, DeepSeek, Claude 3.5 Sonnet, Phi-4, and Grok 3, which exhibit multimodal fluency and partial reasoning, these systems remain fundamentally limited by their reliance on token-level prediction and lack of grounded agency. This paper offers a cross-disciplinary synthesis of AGI development, spanning artificial intelligence, cognitive neuroscience, psychology, generative models, and agent-based systems. We analyze the architectural and cognitive foundations of general intelligence, highlighting the role of modular reasoning, persistent memory, and multi-agent coordination. In particular, we emphasize the rise of Agentic RAG frameworks that combine retrieval, planning, and dynamic tool use to enable more adaptive behavior. We discuss generalization strategies, including information compression, test-time adaptation, and training-free methods, as critical pathways toward flexible, domain-agnostic intelligence. Vision-Language Models (VLMs) are reexamined not just as perception modules but as evolving interfaces for embodied understanding and collaborative task completion. We also argue that true intelligence arises not from scale alone but from the integration of memory and reasoning: an orchestration of modular, interactive, and self-improving components where compression enables adaptive behavior. Drawing on advances in neurosymbolic systems, reinforcement learning, and cognitive scaffolding, we explore how recent architectures begin to bridge the gap between statistical learning and goal-directed cognition. Finally, we identify key scientific, technical, and ethical challenges on the path to AGI.
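Abstract (reference translation): Can machines truly think, reason, and act across domains as humans do? This enduring question continues to shape the pursuit of Artificial General Intelligence (AGI). Despite the growing capabilities of models such as GPT-4.5, DeepSeek, Claude 3.5 Sonnet, Phi-4, and Grok 3, which show multimodal fluency and partial reasoning, these systems remain fundamentally limited by their reliance on token-level prediction and their lack of grounded agency. This paper offers a cross-disciplinary synthesis of AGI development spanning AI, cognitive neuroscience, psychology, generative models, and agent-based systems. We analyze the architectural and cognitive foundations of general intelligence, highlighting the role of modular reasoning, persistent memory, and multi-agent coordination. In particular, we emphasize the rise of Agentic RAG frameworks that combine retrieval, planning, and dynamic tool use to enable more adaptive behavior. We discuss generalization strategies, including information compression, test-time adaptation, and training-free methods, as critical pathways toward flexible, domain-agnostic intelligence. Vision-Language Models (VLMs) are reexamined not only as perception modules but as evolving interfaces for embodied understanding and collaborative task completion. We also argue that true intelligence arises not from scale alone but from the integration of memory and reasoning. Drawing on advances in neurosymbolic systems, reinforcement learning, and cognitive scaffolding, we explore how recent architectures begin to bridge the gap between statistical learning and goal-directed cognition. Finally, we identify key scientific, technical, and ethical challenges on the path to AGI.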

Related papers