Prismatic World Model: Learning Compositional Dynamics for Planning in Hybrid Systems
- URL: http://arxiv.org/abs/2512.08411v1
- Date: Tue, 09 Dec 2025 09:40:34 GMT
- Title: Prismatic World Model: Learning Compositional Dynamics for Planning in Hybrid Systems
- Authors: Mingwei Li, Xiaoyuan Zhang, Chengwei Yang, Zilong Zheng, Yaodong Yang,
- Abstract summary: Prismatic World Model (PRISM-WM) is designed to decompose complex hybrid dynamics into composable primitives. PRISM-WM significantly reduces rollout drift by accurately modeling sharp mode transitions in system dynamics.
- Score: 38.4555621948915
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Model-based planning in robotic domains is fundamentally challenged by the hybrid nature of physical dynamics, where continuous motion is punctuated by discrete events such as contacts and impacts. Conventional latent world models typically employ monolithic neural networks that enforce global continuity, inevitably over-smoothing the distinct dynamic modes (e.g., sticking vs. sliding, flight vs. stance). For a planner, this smoothing results in catastrophic compounding errors during long-horizon lookaheads, rendering the search process unreliable at physical boundaries. To address this, we introduce the Prismatic World Model (PRISM-WM), a structured architecture designed to decompose complex hybrid dynamics into composable primitives. PRISM-WM leverages a context-aware Mixture-of-Experts (MoE) framework where a gating mechanism implicitly identifies the current physical mode, and specialized experts predict the associated transition dynamics. We further introduce a latent orthogonalization objective to ensure expert diversity, effectively preventing mode collapse. By accurately modeling the sharp mode transitions in system dynamics, PRISM-WM significantly reduces rollout drift. Extensive experiments on challenging continuous control benchmarks, including high-dimensional humanoids and diverse multi-task settings, demonstrate that PRISM-WM provides a superior high-fidelity substrate for trajectory optimization algorithms (e.g., TD-MPC), proving its potential as a powerful foundational model for next-generation model-based agents.
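The abstract describes a gating mechanism that weights specialized experts, plus an objective that keeps the experts diverse. The following is a minimal sketch of that general idea, not the authors' implementation: the linear experts, the residual update, the class name `MoEDynamics`, and the cosine-similarity form of `orthogonality_penalty` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class MoEDynamics:
    """Toy mixture-of-experts latent transition model (hypothetical, linear experts)."""

    def __init__(self, latent_dim, action_dim, num_experts, seed=0):
        rng = random.Random(seed)
        in_dim = latent_dim + action_dim
        # Gate maps (latent, action) to one logit per expert.
        self.gate_W = random_matrix(num_experts, in_dim, rng)
        # Each expert predicts a latent increment for "its" dynamic mode.
        self.expert_Ws = [random_matrix(latent_dim, in_dim, rng)
                          for _ in range(num_experts)]

    def step(self, z, a):
        x = list(z) + list(a)
        gates = softmax(matvec(self.gate_W, x))
        deltas = [matvec(W, x) for W in self.expert_Ws]
        # Residual update: z' = z + sum_k gate_k * delta_k.
        z_next = [zi + sum(g * d[i] for g, d in zip(gates, deltas))
                  for i, zi in enumerate(z)]
        return z_next, gates, deltas


def orthogonality_penalty(deltas, eps=1e-8):
    """Mean squared cosine similarity between distinct expert outputs.

    Assumed stand-in for a diversity objective: 0 when expert outputs are
    mutually orthogonal, close to 1 when they are parallel.
    """
    total, pairs = 0.0, 0
    for i in range(len(deltas)):
        for j in range(i + 1, len(deltas)):
            dot = sum(p * q for p, q in zip(deltas[i], deltas[j]))
            ni = math.sqrt(sum(p * p for p in deltas[i]))
            nj = math.sqrt(sum(q * q for q in deltas[j]))
            total += (dot / (ni * nj + eps)) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0


def rollout(model, z0, actions):
    """Roll the model forward over an action sequence, as a planner would."""
    z, traj = list(z0), []
    for a in actions:
        z, gates, _ = model.step(z, a)
        traj.append((z, gates))
    return traj
```

In a trained model the penalty would be added to the prediction loss with some weight, and the gate weights along a rollout could be inspected to see which expert dominates near a contact event. Both uses are illustrative assumptions here.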
Related papers
- Scalable Spatio-Temporal SE(3) Diffusion for Long-Horizon Protein Dynamics [51.85385061275941]
Molecular dynamics (MD) simulations remain the gold standard for studying protein dynamics. Recent generative models have shown promise in accelerating simulations, yet they struggle with long-horizon generation. We present STAR-MD, a scalable diffusion model that generates physically plausible protein trajectories over micro-scale timescales.
arXiv Detail & Related papers (2026-02-02T14:13:28Z)
- DDP-WM: Disentangled Dynamics Prediction for Efficient World Models [79.53092337527382]
We introduce DDP-WM, a novel world model centered on the principle of Disentangled Dynamics Prediction. DDP-WM realizes this decomposition through an architecture that integrates efficient historical processing with dynamic localization. Experiments demonstrate that DDP-WM achieves significant efficiency and performance across diverse tasks.
arXiv Detail & Related papers (2026-02-02T08:04:25Z)
- Aligning Agentic World Models via Knowledgeable Experience Learning [68.85843641222186]
We introduce WorldMind, a framework that constructs a symbolic World Knowledge Repository by synthesizing environmental feedback. WorldMind achieves superior performance compared to baselines with remarkable cross-model and cross-environment transferability.
arXiv Detail & Related papers (2026-01-19T17:33:31Z)
- TeleWorld: Towards Dynamic Multimodal Synthesis with a 4D World Model [53.555353366322464]
We present TeleWorld, a real-time multimodal 4D world modeling framework that unifies video generation, dynamic scene reconstruction, and long-term world memory within a closed-loop system. Our approach achieves seamless integration of dynamic object modeling and static scene representation within a unified 4D framework, advancing world models toward practical, interactive, and computationally accessible synthesis systems.
arXiv Detail & Related papers (2025-12-31T18:31:46Z)
- Benchmarking neural surrogates on realistic spatiotemporal multiphysics flows [18.240532888032394]
We present REALM (REalistic AI Learning for Multiphysics), a rigorous benchmarking framework designed to test neural surrogates on challenging, application-driven reactive flows. We benchmark over a dozen representative surrogate model families, including spectral operators, convolutional models, Transformers, pointwise operators, and graph/mesh networks. We identify three robust trends: (i) a scaling barrier governed jointly by dimensionality, stiffness, and mesh irregularity, leading to rapidly growing rollout errors; (ii) performance primarily controlled by architectural inductive biases rather than parameter count; and (iii) a persistent gap between nominal accuracy metrics and physically
arXiv Detail & Related papers (2025-12-21T05:04:13Z)
- Model-Based Diffusion Sampling for Predictive Control in Offline Decision Making [48.998030470623384]
Offline decision-making requires reliable behaviors from fixed datasets without further interaction. We propose a compositional model-based diffusion framework consisting of: (i) a planner that generates diverse, task-aligned trajectories; (ii) a dynamics model that enforces consistency with the underlying system dynamics; and (iii) a ranker module that selects behaviors aligned with the task objectives.
arXiv Detail & Related papers (2025-12-09T06:26:02Z)
- MODE: Learning compositional representations of complex systems with Mixtures Of Dynamical Experts [5.250743580183822]
MODE is a graphical modeling framework whose neural gating mechanism decomposes complex dynamics into sparse, interpretable components. We show how MODE succeeds on challenging forecasting tasks which simulate key cycling and branching processes in cell biology.
arXiv Detail & Related papers (2025-10-10T17:52:31Z)
- Multi-modal Spatio-Temporal Transformer for High-resolution Land Subsidence Prediction [3.3295066998131637]
We propose a novel framework that fuses dynamic displacement data with static physical priors. On the public EGMS dataset, MM-STT establishes a new state-of-the-art, reducing the long-range forecast RMSE by an order of magnitude.
arXiv Detail & Related papers (2025-09-29T18:49:04Z)
- SAMPO:Scale-wise Autoregression with Motion PrOmpt for generative world models [42.814012901180774]
SAMPO is a hybrid framework that combines visual autoregressive modeling for intra-frame generation with causal modeling for next-frame generation. We show that SAMPO achieves competitive performance in action-conditioned video prediction and model-based control. We also evaluate SAMPO's zero-shot generalization and scaling behavior, demonstrating its ability to generalize to unseen tasks.
arXiv Detail & Related papers (2025-09-19T02:41:37Z)
- Kuramoto Orientation Diffusion Models [67.0711709825854]
Orientation-rich images, such as fingerprints and textures, often exhibit coherent angular patterns. Motivated by the role of phase synchronization in biological systems, we propose a score-based generative model. We achieve competitive results on general image benchmarks and significantly improve generation quality on orientation-dense datasets like fingerprints and textures.
arXiv Detail & Related papers (2025-09-18T18:18:49Z)
- Towards Efficient General Feature Prediction in Masked Skeleton Modeling [59.46799426434277]
We propose a novel General Feature Prediction framework (GFP) for efficient masked skeleton modeling. Our key innovation is replacing conventional low-level reconstruction with high-level feature prediction that spans from local motion patterns to global semantic representations.
arXiv Detail & Related papers (2025-09-03T18:05:02Z)
- Simultaneous Multi-Robot Motion Planning with Projected Diffusion Models [57.45019514036948]
Simultaneous MRMP Diffusion (SMD) is a novel approach integrating constrained optimization into the diffusion sampling process to produce collision-free, kinematically feasible trajectories. The paper introduces a comprehensive MRMP benchmark to evaluate trajectory planning algorithms across scenarios with varying robot densities, obstacle complexities, and motion constraints.
arXiv Detail & Related papers (2025-02-05T20:51:28Z)
- Evolve Smoothly, Fit Consistently: Learning Smooth Latent Dynamics For Advection-Dominated Systems [14.553972457854517]
We present a data-driven, space-time continuous framework to learn surrogate models for complex physical systems.
We leverage the expressive power of the network and a specially designed consistency-inducing regularization to obtain latent trajectories that are both low-dimensional and smooth.
arXiv Detail & Related papers (2023-01-25T03:06:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.