"Just in Time" World Modeling Supports Human Planning and Reasoning
- URL: http://arxiv.org/abs/2601.14514v1
- Date: Tue, 20 Jan 2026 22:08:19 GMT
- Title: "Just in Time" World Modeling Supports Human Planning and Reasoning
- Authors: Tony Chen, Sam Cheyette, Kelsey Allen, Joshua Tenenbaum, Kevin Smith,
- Abstract summary: We present a "Just-in-Time" framework for simulation-based reasoning.<n>We show how such representations can be constructed online with minimal added computation.<n>We find strong empirical support for this account over alternative models in a grid-world planning task and a physical reasoning task.
- Score: 1.526051753924211
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Probabilistic mental simulation is thought to play a key role in human reasoning, planning, and prediction, yet the demands of simulation in complex environments exceed realistic human capacity limits. A theory with growing evidence is that people simulate using simplified representations of the environment that abstract away from irrelevant details, but it is unclear how people determine these simplifications efficiently. Here, we present a "Just-in-Time" framework for simulation-based reasoning that demonstrates how such representations can be constructed online with minimal added computation. The model uses a tight interleaving of simulation, visual search, and representation modification, with the current simulation guiding where to look and visual search flagging objects that should be encoded for subsequent simulation. Despite only ever encoding a small subset of objects, the model makes high-utility predictions. We find strong empirical support for this account over alternative models in a grid-world planning task and a physical reasoning task across a range of behavioral measures. Together, these results offer a concrete algorithmic account of how people construct reduced representations to support efficient mental simulation.
Related papers
- PolaRiS: Scalable Real-to-Sim Evaluations for Generalist Robot Policies [88.78188489161028]
We introduce Policy Evaluation and Environment Reconstruction in Simulation (PolaRiS)<n>PolaRiS is a scalable real-to-sim framework for high-fidelity simulated robot evaluation.<n>We show that PolaRiS evaluations provide a much stronger correlation to real world generalist policy performance than existing simulated benchmarks.
arXiv Detail & Related papers (2025-12-18T18:49:41Z) - SimBench: Benchmarking the Ability of Large Language Models to Simulate Human Behaviors [58.87134689752605]
We introduce SimBench, the first large-scale, standardized benchmark for a robust, reproducible science of LLM simulation.<n>We show that even the best LLMs today have limited simulation ability (score: 40.80/100), performance scales log-linearly with model size.<n>We demonstrate that simulation ability correlates most strongly with deep, knowledge-intensive reasoning.
arXiv Detail & Related papers (2025-10-20T13:14:38Z) - Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations [61.235500325327585]
Existing AI benchmarks primarily assess verbal reasoning, neglecting the complexities of non-verbal, multi-step visual simulation.<n>We introduce STARE, a benchmark designed to rigorously evaluate multimodal large language models on tasks better solved through visual simulation.<n>Our evaluations show that models excel at reasoning over simpler 2D transformations, but perform close to random chance on more complex tasks.
arXiv Detail & Related papers (2025-06-05T05:09:46Z) - YuLan-OneSim: Towards the Next Generation of Social Simulator with Large Language Models [50.35333054932747]
We introduce a novel social simulator called YuLan-OneSim.<n>Users can simply describe and refine their simulation scenarios through natural language interactions with our simulator.<n>We implement 50 default simulation scenarios spanning 8 domains, including economics, sociology, politics, psychology, organization, demographics, law, and communication.
arXiv Detail & Related papers (2025-05-12T14:05:17Z) - A simulation-heuristics dual-process model for intuitive physics [28.707537312978502]
Using a pouring-marble task, our human study revealed two distinct error patterns when predicting pouring angles, differentiated by simulation time.<n>We propose a dual-process framework, Simulation-Heuristics Model (SHM), where intuitive physics employs simulation for short-time simulation but switches to simulation when simulation becomes costly.<n>The SHM aligns more precisely with human behavior and demonstrates consistent predictive performance across diverse scenarios, advancing our understanding of the adaptive nature of intuitive physical reasoning.
arXiv Detail & Related papers (2025-04-13T12:34:02Z) - GausSim: Foreseeing Reality by Gaussian Simulator for Elastic Objects [55.02281855589641]
GausSim is a novel neural network-based simulator designed to capture the dynamic behaviors of real-world elastic objects represented through Gaussian kernels.<n>We leverage continuum mechanics and treat each kernel as a Center of Mass System (CMS) that represents continuous piece of matter.<n>In addition, GausSim incorporates explicit physics constraints, such as mass and momentum conservation, ensuring interpretable results and robust, physically plausible simulations.
arXiv Detail & Related papers (2024-12-23T18:58:17Z) - A Unified Simulation Framework for Visual and Behavioral Fidelity in
Crowd Analysis [6.460475042590685]
We present a human crowd simulator, called UniCrowd, and its associated validation pipeline.
We show how the simulator can generate annotated data, suitable for computer vision tasks, in particular for detection and segmentation, as well as the related applications, as crowd counting, human pose estimation, trajectory analysis and prediction, and anomaly detection.
arXiv Detail & Related papers (2023-12-05T09:43:27Z) - Informal Safety Guarantees for Simulated Optimizers Through
Extrapolation from Partial Simulations [0.0]
Self-supervised learning is the backbone of state of the art language modeling.
It has been argued that training with predictive loss on a self-supervised dataset causes simulators.
arXiv Detail & Related papers (2023-11-29T09:32:56Z) - Neural Likelihood Approximation for Integer Valued Time Series Data [0.0]
We construct a neural likelihood approximation that can be trained using unconditional simulation of the underlying model.
We demonstrate our method by performing inference on a number of ecological and epidemiological models.
arXiv Detail & Related papers (2023-10-19T07:51:39Z) - Near-realtime Facial Animation by Deep 3D Simulation Super-Resolution [7.14576106770047]
We present a neural network-based simulation framework that can efficiently and realistically enhance a facial performance produced by a low-cost, realtime physics-based simulation.
We use face animation as an exemplar of such a simulation domain, where creating this semantic congruence is achieved by simply dialing in the same muscle actuation controls and skeletal pose in the two simulators.
Our proposed neural network super-resolution framework generalizes from this training set to unseen expressions, compensates for modeling discrepancies between the two simulations due to limited resolution or cost-cutting approximations in the real-time variant, and does not require any semantic descriptors or parameters to
arXiv Detail & Related papers (2023-05-05T00:09:24Z) - CausalCity: Complex Simulations with Agency for Causal Discovery and
Reasoning [68.74447489372037]
We present a high-fidelity simulation environment that is designed for developing algorithms for causal discovery and counterfactual reasoning.
A core component of our work is to introduce textitagency, such that it is simple to define and create complex scenarios.
We perform experiments with three state-of-the-art methods to create baselines and highlight the affordances of this environment.
arXiv Detail & Related papers (2021-06-25T00:21:41Z) - Deep Bayesian Active Learning for Accelerating Stochastic Simulation [74.58219903138301]
Interactive Neural Process (INP) is a deep active learning framework for simulations and with active learning approaches.
For active learning, we propose a novel acquisition function, Latent Information Gain (LIG), calculated in the latent space of NP based models.
The results demonstrate STNP outperforms the baselines in the learning setting and LIG achieves the state-of-the-art for active learning.
arXiv Detail & Related papers (2021-06-05T01:31:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.