Fugu-MT Paper Translation (Abstract): Embodied AI: From LLMs to World Models

Paper overview: Embodied AI: From LLMs to World Models

arxiv url: http://arxiv.org/abs/2509.20021v1
Date: Wed, 24 Sep 2025 11:37:48 GMT
Status: Translation complete
Last updated in system: 2025-09-25 20:53:19.796979
Title: Embodied AI: From LLMs to World Models
Title (reference translation): Embodied AI: From LLMs to World Models
Authors: Tongtong Feng, Xin Wang, Yu-Gang Jiang, Wenwu Zhu
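Abstract summary: Embodied Artificial Intelligence (AI) is an intelligent system paradigm for achieving Artificial General Intelligence (AGI). Recent breakthroughs in Large Language Models (LLMs) and World Models (WMs) have drawn significant attention for embodied AI.
Reference score (independently computed attention score): 65.68972714346909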
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Embodied Artificial Intelligence (AI) is an intelligent system paradigm for achieving Artificial General Intelligence (AGI), serving as the cornerstone for various applications and driving the evolution from cyberspace to physical systems. Recent breakthroughs in Large Language Models (LLMs) and World Models (WMs) have drawn significant attention for embodied AI. On the one hand, LLMs empower embodied AI via semantic reasoning and task decomposition, bringing high-level natural language instructions and low-level natural language actions into embodied cognition. On the other hand, WMs empower embodied AI by building internal representations and future predictions of the external world, facilitating physical law-compliant embodied interactions. As such, this paper comprehensively explores the literature in embodied AI from basics to advances, covering both LLM driven and WM driven works. In particular, we first present the history, key technologies, key components, and hardware systems of embodied AI, as well as discuss its development via looking from unimodal to multimodal angle. We then scrutinize the two burgeoning fields of embodied AI, i.e., embodied AI with LLMs/multimodal LLMs (MLLMs) and embodied AI with WMs, meticulously delineating their indispensable roles in end-to-end embodied cognition and physical laws-driven embodied interactions. Building upon the above advances, we further share our insights on the necessity of the joint MLLM-WM driven embodied AI architecture, shedding light on its profound significance in enabling complex tasks within physical worlds. In addition, we examine representative applications of embodied AI, demonstrating its wide applicability in real-world scenarios. Last but not least, we point out future research directions of embodied AI that deserve further investigation.
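Abstract (reference translation): Embodied Artificial Intelligence (AI) is an intelligent system paradigm for achieving Artificial General Intelligence (AGI); it underpins a wide range of applications and drives the evolution from cyberspace to physical systems. Recent breakthroughs in Large Language Models (LLMs) and World Models (WMs) have drawn significant attention for embodied AI. LLMs strengthen embodied AI through semantic reasoning and task decomposition, bringing high-level natural language instructions and low-level natural language actions into embodied cognition. WMs strengthen embodied AI by building internal representations and future predictions of the external world, which facilitates embodied interactions that comply with physical laws. This paper therefore comprehensively surveys the embodied AI literature, from basics to advances, covering both LLM-driven and WM-driven work. It first presents the history, key technologies, key components, and hardware systems of embodied AI, and discusses its development from a unimodal to a multimodal perspective. It then examines embodied AI with LLMs/multimodal LLMs (MLLMs) and embodied AI with WMs, delineating their indispensable roles in end-to-end embodied cognition and in embodied interactions driven by physical laws. Building on these advances, the authors share their insights on the necessity of a joint MLLM-WM driven embodied AI architecture and highlight its significance for enabling complex tasks in the physical world. The paper also examines representative applications of embodied AI, demonstrating its applicability in real-world scenarios. Finally, it points out future research directions for embodied AI that deserve further investigation.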

Related papers