Understanding World or Predicting Future? A Comprehensive Survey of World Models
- URL: http://arxiv.org/abs/2411.14499v1
- Date: Thu, 21 Nov 2024 03:58:50 GMT
- Title: Understanding World or Predicting Future? A Comprehensive Survey of World Models
- Authors: Jingtao Ding, Yunke Zhang, Yu Shang, Yuheng Zhang, Zefang Zong, Jie Feng, Yuan Yuan, Hongyuan Su, Nian Li, Nicholas Sukiennik, Fengli Xu, Yong Li
- Abstract summary: This survey offers a comprehensive review of the literature on world models.
World models are regarded as tools for either understanding the present state of the world or predicting its future dynamics.
We explore the application of world models in key domains, including autonomous driving, robotics, and social simulacra.
- Score: 21.96900555014452
- Abstract: The concept of world models has garnered significant attention due to advancements in multimodal large language models such as GPT-4 and video generation models such as Sora, which are central to the pursuit of artificial general intelligence. This survey offers a comprehensive review of the literature on world models. Generally, world models are regarded as tools for either understanding the present state of the world or predicting its future dynamics. This review presents a systematic categorization of world models, emphasizing two primary functions: (1) constructing internal representations to understand the mechanisms of the world, and (2) predicting future states to simulate and guide decision-making. Initially, we examine the current progress in these two categories. We then explore the application of world models in key domains, including autonomous driving, robotics, and social simulacra, with a focus on how each domain utilizes these aspects. Finally, we outline key challenges and provide insights into potential future research directions.
Related papers
- Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey [61.39993881402787]
World models and video generation are pivotal technologies in the domain of autonomous driving.
This paper investigates the relationship between these two technologies.
By analyzing the interplay between video generation and world models, this survey identifies critical challenges and future research directions.
arXiv Detail & Related papers (2024-11-05T08:58:35Z)
- Making Large Language Models into World Models with Precondition and Effect Knowledge [1.8561812622368763]
We show that Large Language Models (LLMs) can be induced to perform two critical world model functions.
We validate that the precondition and effect knowledge generated by our models aligns with human understanding of world dynamics.
arXiv Detail & Related papers (2024-09-18T19:28:04Z)
- Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond [101.15395503285804]
General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI).
In this survey, we embark on a comprehensive exploration of the latest advancements in world models.
We examine challenges and limitations of world models, and discuss their potential future directions.
arXiv Detail & Related papers (2024-05-06T14:37:07Z)
- Learning World Models With Hierarchical Temporal Abstractions: A Probabilistic Perspective [2.61072980439312]
Devising formalisms to develop internal world models is a critical research challenge in the domains of artificial intelligence and machine learning.
This thesis identifies several limitations with the prevalent use of state space models as internal world models.
The structure of the models in these formalisms facilitates exact probabilistic inference using belief propagation, as well as end-to-end learning via backpropagation through time.
These formalisms integrate the concept of uncertainty in world states, thus improving the system's capacity to emulate the nature of the real world and quantify the confidence in its predictions.
arXiv Detail & Related papers (2024-04-24T12:41:04Z)
- World Models for Autonomous Driving: An Initial Survey [16.448614804069674]
The capability to accurately predict future events and assess their implications is paramount for both safety and efficiency.
World models have emerged as a transformative approach, enabling autonomous driving systems to synthesize and interpret vast amounts of sensor data.
This paper provides an initial review of the current state and prospective advancements of world models in autonomous driving.
arXiv Detail & Related papers (2024-03-05T03:23:55Z)
- Open-world Machine Learning: A Review and New Outlooks [83.6401132743407]
This paper aims to provide a comprehensive introduction to the emerging open-world machine learning paradigm.
It aims to help researchers build more powerful AI systems in their respective fields, and to promote the development of artificial general intelligence.
arXiv Detail & Related papers (2024-03-04T06:25:26Z)
- Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems that see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
Models trained to bridge the gap between such modalities, coupled with large-scale training data, facilitate contextual reasoning, generalization, and prompt capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, holding interactive dialogues by asking questions about an image or video scene, or manipulating a robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
- Foundation Models for Decision Making: Problems, Methods, and Opportunities [124.79381732197649]
Foundation models pretrained on diverse data at scale have demonstrated extraordinary capabilities in a wide range of vision and language tasks.
New paradigms are emerging for training foundation models to interact with other agents and perform long-term reasoning.
Research at the intersection of foundation models and decision making holds tremendous promise for creating powerful new systems.
arXiv Detail & Related papers (2023-03-07T18:44:07Z)
- Predictive World Models from Real-World Partial Observations [66.80340484148931]
We present a framework for learning a probabilistic predictive world model for real-world road environments.
While prior methods require complete states as ground truth for learning, we present a novel sequential training method that allows hierarchical VAEs (HVAEs) to learn to predict complete states from partially observed states only.
arXiv Detail & Related papers (2023-01-12T02:07:26Z)