Discovering High Level Patterns from Simulation Traces
- URL: http://arxiv.org/abs/2602.10009v1
- Date: Tue, 10 Feb 2026 17:31:39 GMT
- Title: Discovering High Level Patterns from Simulation Traces
- Authors: Sean Memery, Kartic Subr
- Abstract summary: We propose a natural language guided method to discover coarse-grained patterns from detailed simulation logs. Specifically, we synthesize programs that operate on simulation logs and map them to a series of high-level activated patterns. We show, through two physics benchmarks, that this annotated representation of the simulation log is more amenable to natural language reasoning about physical systems.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial intelligence (AI) agents embedded in environments with physics-based interaction face many challenges including reasoning, planning, summarization, and question answering. This problem is exacerbated when a human user wishes to either guide or interact with the agent in natural language. Although Language Models (LMs) are the default AI tool of choice, they struggle with tasks involving physics. The LM's capability for physical reasoning is learned from observational data, rather than being grounded in simulation. A common approach is to include simulation traces as context, but this suffers from poor scalability because simulation traces contain large volumes of fine-grained numerical and semantic data. In this paper, we propose a natural language guided method to discover coarse-grained patterns (e.g., 'rigid-body collision', 'stable support', etc.) from detailed simulation logs. Specifically, we synthesize programs that operate on simulation logs and map them to a series of high-level activated patterns. We show, through two physics benchmarks, that this annotated representation of the simulation log is more amenable to natural language reasoning about physical systems. We demonstrate how this method enables LMs to generate effective reward programs from goals specified in natural language, which may be used within the context of planning or supervised learning.
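A minimal sketch of what such a synthesized pattern program might look like. The log schema, predicate names, and thresholds below are illustrative assumptions, not the paper's actual implementation: the idea is simply that a program scans a fine-grained trace and emits coarse, named pattern activations.

```python
# Illustrative sketch: map a fine-grained simulation log to coarse
# high-level pattern activations (names/thresholds are assumptions).

def detect_patterns(log, contact_eps=1e-3, vel_eps=1e-2):
    """log: list of per-step dicts with 'velocity' and 'contact_distance'.

    Returns a list of (timestep, pattern_name) annotations."""
    annotations = []
    for t in range(1, len(log)):
        prev, cur = log[t - 1], log[t]
        in_contact = cur["contact_distance"] < contact_eps
        # Rigid-body collision: contact plus an abrupt velocity change.
        if in_contact and abs(cur["velocity"] - prev["velocity"]) > vel_eps:
            annotations.append((t, "rigid-body collision"))
        # Stable support: sustained contact with negligible motion.
        elif in_contact and abs(cur["velocity"]) < vel_eps:
            annotations.append((t, "stable support"))
    return annotations
```

An annotated trace like `[(12, "rigid-body collision"), (13, "stable support")]` is far shorter than the raw numeric log, which is what makes it a better fit for an LM's context window.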
Related papers
- SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models [60.80050275581661]
Vision-Language Models (VLMs) exhibit remarkable common-sense and semantic reasoning capabilities, but they lack a grounded understanding of physical dynamics. We present SIMPACT, a test-time, SIMulation-enabled ACTion Planning framework. Our method demonstrates state-of-the-art performance on five challenging, real-world rigid-body and deformable manipulation tasks.
arXiv Detail & Related papers (2025-12-05T18:51:03Z) - SimBench: Benchmarking the Ability of Large Language Models to Simulate Human Behaviors [58.87134689752605]
We introduce SimBench, the first large-scale, standardized benchmark for a robust, reproducible science of LLM simulation. We show that even the best LLMs today have limited simulation ability (score: 40.80/100), and that performance scales log-linearly with model size. We demonstrate that simulation ability correlates most strongly with deep, knowledge-intensive reasoning.
arXiv Detail & Related papers (2025-10-20T13:14:38Z) - MAPS: Advancing Multi-Modal Reasoning in Expert-Level Physical Science [62.96434290874878]
Current Multi-Modal Large Language Models (MLLMs) have shown strong capabilities in general visual reasoning tasks. We develop a new framework, named Multi-Modal Scientific Reasoning with Physics Perception and Simulation (MAPS), based on an MLLM. MAPS decomposes expert-level multi-modal reasoning tasks into physical diagram understanding via a Physical Perception Model (PPM) and reasoning with physical knowledge via a simulator.
arXiv Detail & Related papers (2025-01-18T13:54:00Z) - Physics Context Builders: A Modular Framework for Physical Reasoning in Vision-Language Models [11.282655911647483]
Physical reasoning remains a significant challenge for Vision-Language Models (VLMs). We introduce Physics Context Builders (PCBs), a modular framework where specialized smaller VLMs are fine-tuned to generate detailed physical scene descriptions. PCBs enable the separation of visual perception from reasoning, allowing us to analyze their relative contributions to physical understanding.
arXiv Detail & Related papers (2024-12-11T18:40:16Z) - Conversational Code Generation: a Case Study of Designing a Dialogue System for Generating Driving Scenarios for Testing Autonomous Vehicles [14.711419284809496]
We design a natural language interface to assist a non-coding domain expert in synthesising the desired scenarios and vehicle behaviours. We show that using it to convert utterances to the symbolic program is feasible, despite the very small training dataset. Human experiments show that dialogue is critical to successful simulation generation, leading to a 4.5 times higher success rate than generation without extended conversation.
arXiv Detail & Related papers (2024-10-13T13:07:31Z) - FactorSim: Generative Simulation via Factorized Representation [14.849320460718591]
We introduce FACTORSIM that generates full simulations in code from language input that can be used to train agents.
For evaluation, we introduce a generative simulation benchmark that assesses the generated simulation code's accuracy and effectiveness in facilitating zero-shot transfers in reinforcement learning settings.
We show that FACTORSIM outperforms existing methods in generating simulations regarding prompt alignment (e.g., accuracy), zero-shot transfer abilities, and human evaluation.
arXiv Detail & Related papers (2024-09-26T09:00:30Z) - BeSimulator: A Large Language Model Powered Text-based Behavior Simulator [18.318419980796012]
We propose BeSimulator as an attempt towards behavior simulation in the context of text-based environments. BeSimulator can generalize across scenarios and achieve long-horizon complex simulation. Our experiments show a significant performance improvement in behavior simulation compared to baselines.
arXiv Detail & Related papers (2024-09-24T08:37:04Z) - SimLM: Can Language Models Infer Parameters of Physical Systems? [56.38608628187024]
We investigate the performance of Large Language Models (LLMs) at performing parameter inference in the context of physical systems.
Our experiments suggest that they are not inherently suited to this task, even for simple systems.
We propose a promising direction of exploration, which involves the use of physical simulators to augment the context of LLMs.
arXiv Detail & Related papers (2023-12-21T12:05:19Z) - MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning [63.80739044622555]
We introduce MuSR, a dataset for evaluating language models on soft reasoning tasks specified in a natural language narrative.
This dataset has two crucial features. First, it is created through a novel neurosymbolic synthetic-to-natural generation algorithm.
Second, our dataset instances are free text narratives corresponding to real-world domains of reasoning.
arXiv Detail & Related papers (2023-10-24T17:59:20Z) - User Behavior Simulation with Large Language Model based Agents [116.74368915420065]
We propose an LLM-based agent framework and design a sandbox environment to simulate real user behaviors.
Based on extensive experiments, we find that the simulated behaviors of our method are very close to the ones of real humans.
arXiv Detail & Related papers (2023-06-05T02:58:35Z) - Pre-Trained Language Models for Interactive Decision-Making [72.77825666035203]
We describe a framework for imitation learning in which goals and observations are represented as a sequence of embeddings.
We demonstrate that this framework enables effective generalization across different environments.
For test tasks involving novel goals or novel scenes, initializing policies with language models improves task completion rates by 43.6%.
arXiv Detail & Related papers (2022-02-03T18:55:52Z) - Automated Dissipation Control for Turbulence Simulation with Shell Models [1.675857332621569]
The application of machine learning (ML) techniques, especially neural networks, has seen tremendous success at processing images and language.
In this work we construct a strongly simplified representation of turbulence by using the Gledzer-Ohkitani-Yamada shell model.
We propose an approach that aims to reconstruct statistical properties of turbulence such as the self-similar inertial-range scaling.
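For context, the Gledzer-Ohkitani-Yamada (GOY) shell model mentioned above evolves complex shell amplitudes u_n on logarithmically spaced wavenumbers k_n. The sketch below is a minimal forward-Euler step of one commonly cited form of the model; the coefficient choice (eps = 0.5), viscosity, and lack of forcing are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def goy_step(u, k, dt, nu=1e-7, eps=0.5):
    """One forward-Euler step of a standard GOY shell model (a sketch).

    u:  complex shell amplitudes u_n
    k:  shell wavenumbers, typically k_n = k0 * lam**n with lam = 2
    """
    N = len(u)
    up = np.conj(u)
    du = np.zeros(N, dtype=complex)
    for n in range(N):
        # Nonlinear interactions with nearest and next-nearest shells;
        # boundary shells simply drop the missing terms.
        nl = 0.0j
        if n + 2 < N:
            nl += k[n] * up[n + 1] * up[n + 2]
        if 1 <= n < N - 1:
            nl -= eps * k[n - 1] * up[n - 1] * up[n + 1]
        if n >= 2:
            nl -= (1 - eps) * k[n - 2] * up[n - 1] * up[n - 2]
        # Viscous dissipation acts most strongly on high-wavenumber shells.
        du[n] = 1j * nl - nu * k[n] ** 2 * u[n]
    return u + dt * du
```

Because each shell interacts only with its nearest and next-nearest neighbours, the model captures the energy cascade and inertial-range scaling of turbulence at a tiny fraction of the cost of a direct Navier-Stokes simulation, which is what makes it a tractable testbed for learned dissipation control.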
arXiv Detail & Related papers (2022-01-07T15:03:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.