Related papers: Natural Building Blocks for Structured World Models: Theory, Evidence, and Scaling

Natural Building Blocks for Structured World Models: Theory, Evidence, and Scaling

URL: http://arxiv.org/abs/2511.02091v1
Date: Mon, 03 Nov 2025 22:02:04 GMT
Title: Natural Building Blocks for Structured World Models: Theory, Evidence, and Scaling
Authors: Lancelot Da Costa, Sanjeev Namjoshi, Mohammed Abbas Ansari, Bernhard Schölkopf,
Abstract summary: We propose a framework that specifies the natural building blocks for structured world models.<n>We examine Hidden Markov Models (HMMs) and linear switching dynamical systems (sLDS) as natural building blocks for discrete and continuous modeling.<n>This modular approach supports both passive modeling (generation, forecasting) and active control (planning, decision-making) within the same architecture.
Score: 42.78591555984395
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The field of world modeling is fragmented, with researchers developing bespoke architectures that rarely build upon each other. We propose a framework that specifies the natural building blocks for structured world models based on the fundamental stochastic processes that any world model must capture: discrete processes (logic, symbols) and continuous processes (physics, dynamics); the world model is then defined by the hierarchical composition of these building blocks. We examine Hidden Markov Models (HMMs) and switching linear dynamical systems (sLDS) as natural building blocks for discrete and continuous modeling--which become partially-observable Markov decision processes (POMDPs) and controlled sLDS when augmented with actions. This modular approach supports both passive modeling (generation, forecasting) and active control (planning, decision-making) within the same architecture. We avoid the combinatorial explosion of traditional structure learning by largely fixing the causal architecture and searching over only four depth parameters. We review practical expressiveness through multimodal generative modeling (passive) and planning from pixels (active), with performance competitive to neural approaches while maintaining interpretability. The core outstanding challenge is scalable joint structure-parameter learning; current methods finesse this by cleverly growing structure and parameters incrementally, but are limited in their scalability. If solved, these natural building blocks could provide foundational infrastructure for world modeling, analogous to how standardized layers enabled progress in deep learning.

Related papers

The Trinity of Consistency as a Defining Principle for General World Models [106.16462830681452]
General World Models are capable of learning, simulating, and reasoning about objective physical laws.<n>We propose a principled theoretical framework that defines the essential properties requisite for a General World Model.<n>Our work establishes a principled pathway toward general world models, clarifying both the limitations of current systems and the architectural requirements for future progress.
arXiv Detail & Related papers (2026-02-26T16:15:55Z)
Geometric Priors for Generalizable World Models via Vector Symbolic Architecture [9.216794073296679]
Key challenge in artificial intelligence is understanding how neural systems learn representations that capture the underlying dynamics of the world.<n>We address these issues with a generalizable world model grounded in Vector Symbolic Architecture (VSA) principles as geometric priors.<n>We show how training to have latent group structure yields generalizable, data-efficient, and interpretable world models.
arXiv Detail & Related papers (2026-02-25T00:41:42Z)
ExpertWeaver: Unlocking the Inherent MoE in Dense LLMs with GLU Activation Patterns [68.61814799047956]
Mixture-of-Experts (MoE) effectively scales model capacity while preserving computational efficiency through sparse expert activation.<n>We introduce ExpertWeaver, a training-free framework that partitions neurons according to their activation patterns and constructs shared experts and specialized routed experts with layer-adaptive configurations.
arXiv Detail & Related papers (2026-02-17T11:50:58Z)
Top 10 Open Challenges Steering the Future of Diffusion Language Model and Its Variants [85.33837131101342]
We propose a strategic roadmap organized into four pillars: foundational infrastructure, algorithmic optimization, cognitive reasoning, and unified multimodal intelligence.<n>We argue that this transition is essential for developing next-generation AI capable of complex structural reasoning, dynamic self-correction, and seamless multimodal integration.
arXiv Detail & Related papers (2026-01-20T14:58:23Z)
Rethinking the Role of Dynamic Sparse Training for Scalable Deep Reinforcement Learning [58.533203990515034]
Scaling neural networks has driven breakthrough advances in machine learning, yet this paradigm fails in deep reinforcement learning (DRL)<n>We show that dynamic sparse training strategies provide module-specific benefits that complement the primary scalability foundation established by architectural improvements.<n>We finally distill these insights into Module-Specific Training (MST), a practical framework that exploits the benefits of architectural improvements and demonstrates substantial scalability gains across diverse RL algorithms without algorithmic modifications.
arXiv Detail & Related papers (2025-10-14T03:03:08Z)
Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate [1.0152838128195467]
The prevailing paradigm for scaling large language models (LLMs) involves monolithic, end-to-end training.<n>This paper explores an alternative, constructive scaling paradigm, enabled by the principle of emergent semantics in Transformers.<n>We operationalize this with a layer-wise constructive methodology that combines strict layer freezing in early stages with efficient, holistic fine-tuning of the entire model stack.
arXiv Detail & Related papers (2025-07-08T20:01:15Z)
Information-Theoretic Framework for Understanding Modern Machine-Learning [4.435094091999926]
We present an information-theoretic framework that views learning as universal prediction under log loss.<n>We argue that successful architectures possess a broad complexity range, enabling learning in highly over- parameterized model classes.<n>The framework sheds light on the role of inductive biases, the effectiveness of descent gradient, and phenomena such as flat minima.
arXiv Detail & Related papers (2025-06-09T11:32:31Z)
Learning to Walk from Three Minutes of Real-World Data with Semi-structured Dynamics Models [9.318262213262866]
We introduce a novel framework for learning semi-structured dynamics models for contact-rich systems. We make accurate long-horizon predictions with substantially less data than prior methods. We validate our approach on a real-world Unitree Go1 quadruped robot.
arXiv Detail & Related papers (2024-10-11T18:11:21Z)
Topological Deep Learning with State-Space Models: A Mamba Approach for Simplicial Complexes [4.787059527893628]
We propose a novel architecture designed to operate with simplicial complexes, utilizing the Mamba state-space model as its backbone. Our approach generates sequences for the nodes based on the neighboring cells, enabling direct communication between all higher-order structures, regardless of their rank.
arXiv Detail & Related papers (2024-09-18T14:49:25Z)
Configurable Foundation Models: Building LLMs from a Modular Perspective [115.63847606634268]
A growing tendency to decompose LLMs into numerous functional modules allows for inference with part of modules and dynamic assembly of modules to tackle complex tasks. We coin the term brick to represent each functional module, designating the modularized structure as customizable foundation models. We present four brick-oriented operations: retrieval and routing, merging, updating, and growing. We find that the FFN layers follow modular patterns with functional specialization of neurons and functional neuron partitions.
arXiv Detail & Related papers (2024-09-04T17:01:02Z)
Latent Traversals in Generative Models as Potential Flows [113.4232528843775]
We propose to model latent structures with a learned dynamic potential landscape. Inspired by physics, optimal transport, and neuroscience, these potential landscapes are learned as physically realistic partial differential equations. Our method achieves both more qualitatively and quantitatively disentangled trajectories than state-of-the-art baselines.
arXiv Detail & Related papers (2023-04-25T15:53:45Z)
S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards exploiting dynamic structure that are capable of simultaneously exploiting both modular andtemporal structures. We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.