Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
- URL: http://arxiv.org/abs/2405.03520v1
- Date: Mon, 6 May 2024 14:37:07 GMT
- Title: Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
- Authors: Zheng Zhu, Xiaofeng Wang, Wangbo Zhao, Chen Min, Nianchen Deng, Min Dou, Yuqi Wang, Botian Shi, Kai Wang, Chi Zhang, Yang You, Zhaoxiang Zhang, Dawei Zhao, Liang Xiao, Jian Zhao, Jiwen Lu, Guan Huang
- Abstract summary: General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI).
In this survey, we embark on a comprehensive exploration of the latest advancements in world models.
We examine challenges and limitations of world models, and discuss their potential future directions.
- Score: 101.15395503285804
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI), serving as the cornerstone for various applications ranging from virtual environments to decision-making systems. Recently, the emergence of the Sora model has attracted significant attention due to its remarkable simulation capabilities, which exhibit an incipient comprehension of physical laws. In this survey, we embark on a comprehensive exploration of the latest advancements in world models. Our analysis navigates through the forefront of generative methodologies in video generation, where world models stand as pivotal constructs facilitating the synthesis of highly realistic visual content. Additionally, we scrutinize the burgeoning field of autonomous-driving world models, meticulously delineating their indispensable role in reshaping transportation and urban mobility. Furthermore, we delve into the intricacies inherent in world models deployed within autonomous agents, shedding light on their profound significance in enabling intelligent interactions within dynamic environmental contexts. Finally, we examine the challenges and limitations of world models, and discuss their potential future directions. We hope this survey can serve as a foundational reference for the research community and inspire continued innovation. This survey will be regularly updated at: https://github.com/GigaAI-research/General-World-Models-Survey.
Related papers
- Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI [95.96983812740683]
Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General Intelligence (AGI).
MLMs and WMs have attracted significant attention due to their remarkable perception, interaction, and reasoning capabilities.
In this survey, we give a comprehensive exploration of the latest advancements in Embodied AI.
arXiv Detail & Related papers (2024-07-09T14:14:47Z)
- World Models for Autonomous Driving: An Initial Survey [16.448614804069674]
The capability to accurately predict future events and assess their implications is paramount for both safety and efficiency.
World models have emerged as a transformative approach, enabling autonomous driving systems to synthesize and interpret vast amounts of sensor data.
This paper provides an initial review of the current state and prospective advancements of world models in autonomous driving.
arXiv Detail & Related papers (2024-03-05T03:23:55Z)
- Open-world Machine Learning: A Review and New Outlooks [83.6401132743407]
This paper aims to provide a comprehensive introduction to the emerging open-world machine learning paradigm.
It aims to help researchers build more powerful AI systems in their respective fields, and to promote the development of artificial general intelligence.
arXiv Detail & Related papers (2024-03-04T06:25:26Z)
- WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens [75.02160668328425]
We introduce WorldDreamer, a pioneering world model to foster a comprehensive comprehension of general world physics and motions.
WorldDreamer frames world modeling as an unsupervised visual sequence modeling challenge.
Our experiments show that WorldDreamer excels in generating videos across different scenarios, including natural scenes and driving environments.
arXiv Detail & Related papers (2024-01-18T14:01:20Z)
- Neural World Models for Computer Vision [2.741266294612776]
We present a framework to train a world model and a policy, parameterised by deep neural networks.
We leverage important computer vision concepts such as geometry, semantics, and motion to scale world models to complex urban driving scenes.
Our model can jointly predict static scene, dynamic scene, and ego-behaviour in an urban driving environment.
arXiv Detail & Related papers (2023-06-15T14:58:21Z)
- Real-World Humanoid Locomotion with Reinforcement Learning [92.85934954371099]
We present a fully learning-based approach for real-world humanoid locomotion.
Our controller can walk over various outdoor terrains, is robust to external disturbances, and can adapt in context.
arXiv Detail & Related papers (2023-03-06T18:59:09Z)
- Predictive World Models from Real-World Partial Observations [66.80340484148931]
We present a framework for learning a probabilistic predictive world model for real-world road environments.
While prior methods require complete states as ground truth for learning, we present a novel sequential training method to allow HVAEs to learn to predict complete states from partially observed states only.
arXiv Detail & Related papers (2023-01-12T02:07:26Z)
- Active World Model Learning with Progress Curiosity [12.077052764803163]
World models are self-supervised predictive models of how the world evolves.
In this work, we study how to design such a curiosity-driven Active World Model Learning system.
We propose an AWML system driven by $\gamma$-Progress: a scalable and effective learning progress-based curiosity signal.
arXiv Detail & Related papers (2020-07-15T17:19:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.