Simulation Streams: A Programming Paradigm for Controlling Large Language Models and Building Complex Systems with Generative AI
- URL: http://arxiv.org/abs/2501.18668v1
- Date: Thu, 30 Jan 2025 16:38:03 GMT
- Title: Simulation Streams: A Programming Paradigm for Controlling Large Language Models and Building Complex Systems with Generative AI
- Authors: Peter Sunehag, Joel Z. Leibo,
- Abstract summary: Simulation Streams is a programming paradigm designed to efficiently control and leverage Large Language Models (LLMs)
Our primary goal is to create a framework that harnesses the agentic abilities of LLMs while addressing their limitations in maintaining consistency.
- Score: 3.3126968968429407
- License:
- Abstract: We introduce Simulation Streams, a programming paradigm designed to efficiently control and leverage Large Language Models (LLMs) for complex, dynamic simulations and agentic workflows. Our primary goal is to create a minimally interfering framework that harnesses the agentic abilities of LLMs while addressing their limitations in maintaining consistency, selectively ignoring/including information, and enforcing strict world rules. Simulation Streams achieves this through a state-based approach where variables are modified in sequential steps by "operators," producing output on a recurring format and adhering to consistent rules for state variables. This approach focus the LLMs on defined tasks, while aiming to have the context stream remain "in-distribution". The approach incorporates an Entity-Component-System (ECS) architecture to write programs in a more intuitive manner, facilitating reuse of workflows across different components and entities. This ECS approach enhances the modularity of the output stream, allowing for complex, multi-entity simulations while maintaining format consistency, information control, and rule enforcement. It is supported by a custom editor that aids in creating, running, and analyzing simulations. We demonstrate the versatility of simulation streams through an illustrative example of an ongoing market economy simulation, a social simulation of three characters playing a game of catch in a park and a suite of classical reinforcement learning benchmark tasks. These examples showcase Simulation Streams' ability to handle complex, evolving scenarios over 100s-1000s of iterations, facilitate comparisons between different agent workflows and models, and maintain consistency and continued interesting developments in LLM-driven simulations.
Related papers
- Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures.
We also present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms.
We observe that the generated can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z) - CaLMFlow: Volterra Flow Matching using Causal Language Models [14.035963716966787]
CaLMFlow is a framework that casts flow matching as a Volterra integral equation (VIE)
Our method implements tokenization across space and time, thereby solving a VIE over these domains.
We demonstrate CaLMFlow's effectiveness on synthetic and real-world data, including single-cell perturbation response prediction.
arXiv Detail & Related papers (2024-10-03T05:07:41Z) - FactorSim: Generative Simulation via Factorized Representation [14.849320460718591]
We introduce FACTORSIM that generates full simulations in code from language input that can be used to train agents.
For evaluation, we introduce a generative simulation benchmark that assesses the generated simulation code's accuracy and effectiveness in facilitating zero-shot transfers in reinforcement learning settings.
We show that FACTORSIM outperforms existing methods in generating simulations regarding prompt alignment (e.g., accuracy), zero-shot transfer abilities, and human evaluation.
arXiv Detail & Related papers (2024-09-26T09:00:30Z) - LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments [70.91258869156353]
We introduce LangSuitE, a versatile and simulation-free testbed featuring 6 representative embodied tasks in textual embodied worlds.
Compared with previous LLM-based testbeds, LangSuitE offers adaptability to diverse environments without multiple simulation engines.
We devise a novel chain-of-thought (CoT) schema, EmMem, which summarizes embodied states w.r.t. history information.
arXiv Detail & Related papers (2024-06-24T03:36:29Z) - Model Composition for Multimodal Large Language Models [71.5729418523411]
We propose a new paradigm through the model composition of existing MLLMs to create a new model that retains the modal understanding capabilities of each original model.
Our basic implementation, NaiveMC, demonstrates the effectiveness of this paradigm by reusing modality encoders and merging LLM parameters.
arXiv Detail & Related papers (2024-02-20T06:38:10Z) - SymbolicAI: A framework for logic-based approaches combining generative models and solvers [9.841285581456722]
We introduce SymbolicAI, a versatile and modular framework employing a logic-based approach to concept learning and flow management in generative processes.
We treat large language models (LLMs) as semantic solvers that execute tasks based on both natural and formal language instructions.
arXiv Detail & Related papers (2024-02-01T18:50:50Z) - Code Simulation Challenges for Large Language Models [6.970495767499435]
This work studies to what extent Large Language Models (LLMs) can simulate coding and algorithmic tasks.
We introduce benchmarks for straight-line programs, code that contains critical paths, and approximate and redundant instructions.
We propose a novel off-the-shelf prompting method, Chain of Simulation (CoSm), which instructs LLMs to simulate code execution line by line/follow the pattern of compilers.
arXiv Detail & Related papers (2024-01-17T09:23:59Z) - In Situ Framework for Coupling Simulation and Machine Learning with
Application to CFD [51.04126395480625]
Recent years have seen many successful applications of machine learning (ML) to facilitate fluid dynamic computations.
As simulations grow, generating new training datasets for traditional offline learning creates I/O and storage bottlenecks.
This work offers a solution by simplifying this coupling and enabling in situ training and inference on heterogeneous clusters.
arXiv Detail & Related papers (2023-06-22T14:07:54Z) - A Modular Framework for Reinforcement Learning Optimal Execution [68.8204255655161]
We develop a modular framework for the application of Reinforcement Learning to the problem of Optimal Trade Execution.
The framework is designed with flexibility in mind, in order to ease the implementation of different simulation setups.
arXiv Detail & Related papers (2022-08-11T09:40:42Z) - DagSim: Combining DAG-based model structure with unconstrained data
types and relations for flexible, transparent, and modularized data
simulation [2.685173014586162]
We present DagSim, a Python-based framework for DAG-based data simulation without any constraints on variable types or functional relations.
A succinct YAML format for defining the simulation model structure promotes transparency.
We illustrate the capabilities of DagSim through use cases where metadata variables control shapes in an image and patterns in bio-sequences.
arXiv Detail & Related papers (2022-05-06T17:43:27Z) - Learning Robust State Abstractions for Hidden-Parameter Block MDPs [55.31018404591743]
We leverage ideas of common structure from the HiP-MDP setting to enable robust state abstractions inspired by Block MDPs.
We derive instantiations of this new framework for both multi-task reinforcement learning (MTRL) and meta-reinforcement learning (Meta-RL) settings.
arXiv Detail & Related papers (2020-07-14T17:25:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.