Embodied AI Agents: Modeling the World
- URL: http://arxiv.org/abs/2506.22355v3
- Date: Mon, 07 Jul 2025 15:42:11 GMT
- Title: Embodied AI Agents: Modeling the World
- Authors: Pascale Fung, Yoram Bachrach, Asli Celikyilmaz, Kamalika Chaudhuri, Delong Chen, Willy Chung, Emmanuel Dupoux, Hongyu Gong, Hervé Jégou, Alessandro Lazaric, Arjun Majumdar, Andrea Madotto, Franziska Meier, Florian Metze, Louis-Philippe Morency, Théo Moutakanni, Juan Pino, Basile Terver, Joseph Tighe, Paden Tomasello, Jitendra Malik,
- Abstract summary: This paper describes our research on AI agents embodied in visual, virtual or physical forms.<n>We propose that the development of world models is central to reasoning and planning of embodied AI agents.<n>We also propose to learn the mental world model of users to enable better human-agent collaboration.
- Score: 188.85697524284834
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes our research on AI agents embodied in visual, virtual or physical forms, enabling them to interact with both users and their environments. These agents, which include virtual avatars, wearable devices, and robots, are designed to perceive, learn and act within their surroundings, which makes them more similar to how humans learn and interact with the environments as compared to disembodied agents. We propose that the development of world models is central to reasoning and planning of embodied AI agents, allowing these agents to understand and predict their environment, to understand user intentions and social contexts, thereby enhancing their ability to perform complex tasks autonomously. World modeling encompasses the integration of multimodal perception, planning through reasoning for action and control, and memory to create a comprehensive understanding of the physical world. Beyond the physical world, we also propose to learn the mental world model of users to enable better human-agent collaboration.
Related papers
- Neural Brain: A Neuroscience-inspired Framework for Embodied Agents [58.58177409853298]
Current AI systems, such as large language models, remain disembodied, unable to physically engage with the world.<n>At the core of this challenge lies the concept of Neural Brain, a central intelligence system designed to drive embodied agents with human-like adaptability.<n>This paper introduces a unified framework for the Neural Brain of embodied agents, addressing two fundamental challenges.
arXiv Detail & Related papers (2025-05-12T15:05:34Z) - Autonomous Embodied Agents: When Robotics Meets Deep Learning Reasoning [0.9790236766474201]
This dissertation follows the complete creation process of embodied agents for indoor environments.<n>We aim to contribute to research in Embodied AI and autonomous agents, in order to foster future work in this field.
arXiv Detail & Related papers (2025-05-02T00:43:28Z) - Heads Up eXperience (HUX): Always-On AI Companion for Human Computer Environment Interaction [0.5825410941577593]
Heads Up eXperience (HUX) is an AI system designed to bridge the gap between digital and human environments.
By tracking the user's eye gaze, analyzing the surrounding environment, and interpreting verbal contexts, the system captures and enhances multi-modal data.
Intended for deployment in smart glasses and extended reality headsets, HUX AI aims to become a personal and useful AI companion for daily life.
arXiv Detail & Related papers (2024-07-28T13:15:51Z) - V-IRL: Grounding Virtual Intelligence in Real Life [65.87750250364411]
V-IRL is a platform that enables agents to interact with the real world in a virtual yet realistic environment.
Our platform serves as a playground for developing agents that can accomplish various practical tasks.
arXiv Detail & Related papers (2024-02-05T18:59:36Z) - On the Emergence of Symmetrical Reality [51.21203247240322]
We introduce the symmetrical reality framework, which offers a unified representation encompassing various forms of physical-virtual amalgamations.
We propose an instance of an AI-driven active assistance service that illustrates the potential applications of symmetrical reality.
arXiv Detail & Related papers (2024-01-26T16:09:39Z) - Agent AI: Surveying the Horizons of Multimodal Interaction [83.18367129924997]
"Agent AI" is a class of interactive systems that can perceive visual stimuli, language inputs, and other environmentally-grounded data.
We envision a future where people can easily create any virtual reality or simulated scene and interact with agents embodied within the virtual environment.
arXiv Detail & Related papers (2024-01-07T19:11:18Z) - World Models and Predictive Coding for Cognitive and Developmental
Robotics: Frontiers and Challenges [51.92834011423463]
We focus on the two concepts of world models and predictive coding.
In neuroscience, predictive coding proposes that the brain continuously predicts its inputs and adapts to model its own dynamics and control behavior in its environment.
arXiv Detail & Related papers (2023-01-14T06:38:14Z) - Imitating Interactive Intelligence [24.95842455898523]
We study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment.
To build agents that can robustly interact with humans, we would ideally train them while they interact with humans.
We use ideas from inverse reinforcement learning to reduce the disparities between human-human and agent-agent interactive behaviour.
arXiv Detail & Related papers (2020-12-10T13:55:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.