Related papers: Embodied AI: From LLMs to World Models

URL: http://arxiv.org/abs/2509.20021v1
Date: Wed, 24 Sep 2025 11:37:48 GMT
Title: Embodied AI: From LLMs to World Models
Authors: Tongtong Feng, Xin Wang, Yu-Gang Jiang, Wenwu Zhu
Abstract summary: Embodied Artificial Intelligence (AI) is an intelligent system paradigm for achieving Artificial General Intelligence (AGI). Recent breakthroughs in Large Language Models (LLMs) and World Models (WMs) have drawn significant attention for embodied AI.
Score: 65.68972714346909
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Embodied Artificial Intelligence (AI) is an intelligent system paradigm for achieving Artificial General Intelligence (AGI), serving as the cornerstone for various applications and driving the evolution from cyberspace to physical systems. Recent breakthroughs in Large Language Models (LLMs) and World Models (WMs) have drawn significant attention for embodied AI. On the one hand, LLMs empower embodied AI via semantic reasoning and task decomposition, bringing high-level natural language instructions and low-level natural language actions into embodied cognition. On the other hand, WMs empower embodied AI by building internal representations and future predictions of the external world, facilitating physical law-compliant embodied interactions. As such, this paper comprehensively explores the literature in embodied AI from basics to advances, covering both LLM driven and WM driven works. In particular, we first present the history, key technologies, key components, and hardware systems of embodied AI, as well as discuss its development via looking from unimodal to multimodal angle. We then scrutinize the two burgeoning fields of embodied AI, i.e., embodied AI with LLMs/multimodal LLMs (MLLMs) and embodied AI with WMs, meticulously delineating their indispensable roles in end-to-end embodied cognition and physical laws-driven embodied interactions. Building upon the above advances, we further share our insights on the necessity of the joint MLLM-WM driven embodied AI architecture, shedding light on its profound significance in enabling complex tasks within physical worlds. In addition, we examine representative applications of embodied AI, demonstrating its wide applicability in real-world scenarios. Last but not least, we point out future research directions of embodied AI that deserve further investigation.

Related papers

AI Agents and Agentic AI-Navigating a Plethora of Concepts for Future Manufacturing [8.195356684218691]
AI agents are autonomous systems designed to perceive, reason, and act within dynamic environments. LLMs, MLLMs, and Agentic AI contribute to expanding AI's capabilities in information processing, environmental perception, and autonomous decision-making. This study systematically reviews the evolution of AI and AI agent technologies.
arXiv Detail & Related papers (2025-07-02T05:31:17Z)
Multi-agent Embodied AI: Advances and Future Directions [46.23631919950584]
Embodied artificial intelligence (Embodied AI) plays a pivotal role in the application of advanced technologies in the intelligent era. This paper reviews the current state of research, analyzes key contributions, and identifies challenges and future directions.
arXiv Detail & Related papers (2025-05-08T10:13:53Z)
Vision-Language-Action Models: Concepts, Progress, Applications and Challenges [4.180065442680541]
Vision-Language-Action models aim to unify perception, natural language understanding, and embodied action within a single computational framework. This foundational review presents a comprehensive synthesis of recent advancements in Vision-Language-Action models. Key progress areas include architectural innovations, parameter-efficient training strategies, and real-time inference accelerations.
arXiv Detail & Related papers (2025-05-07T19:46:43Z)
A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks [74.52259252807191]
Multimodal Large Language Models (MLLMs) address the complexities of real-world applications far beyond the capabilities of single-modality systems. This paper systematically sorts out the applications of MLLM in multimodal tasks such as natural language, vision, and audio.
arXiv Detail & Related papers (2024-08-02T15:14:53Z)
Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents [55.63497537202751]
This article explores the convergence of connectionist and symbolic artificial intelligence (AI). Traditionally, connectionist AI focuses on neural networks, while symbolic AI emphasizes symbolic representation and logic. Recent advancements in large language models (LLMs) highlight the potential of connectionist architectures in handling human language as a form of symbols.
arXiv Detail & Related papers (2024-07-11T14:00:53Z)
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI [116.8199519880327]
Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General Intelligence (AGI). In this survey, we give a comprehensive exploration of the latest advancements in Embodied AI.
arXiv Detail & Related papers (2024-07-09T14:14:47Z)
Position Paper: Agent AI Towards a Holistic Intelligence [53.35971598180146]
We emphasize developing Agent AI -- an embodied system that integrates large foundation models into agent actions. In this paper, we propose a novel large action model to achieve embodied intelligent behavior, the Agent Foundation Model.
arXiv Detail & Related papers (2024-02-28T16:09:56Z)
A call for embodied AI [1.7544885995294304]
We propose Embodied AI as the next fundamental step in the pursuit of Artificial General Intelligence. By broadening the scope of Embodied AI, we introduce a theoretical framework based on cognitive architectures. This framework is aligned with Friston's active inference principle, offering a comprehensive approach to EAI development.
arXiv Detail & Related papers (2024-02-06T09:11:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
