TongSIM: A General Platform for Simulating Intelligent Machines
- URL: http://arxiv.org/abs/2512.20206v1
- Date: Tue, 23 Dec 2025 10:00:43 GMT
- Title: TongSIM: A General Platform for Simulating Intelligent Machines
- Authors: Zhe Sun, Kunlun Wu, Chuanjian Fu, Zeming Song, Langyong Shi, Zihe Xue, Bohan Jing, Ying Yang, Xiaomeng Gao, Aijia Li, Tianyu Guo, Huiying Li, Xueyuan Yang, Rongkai Liu, Xinyi He, Yuxi Wang, Yue Li, Mingyuan Liu, Yujie Lu, Hongzhao Xie, Shiyun Zhao, Bo Dai, Wei Wang, Tao Yuan, Song-Chun Zhu, Yujia Peng, Zhenliang Zhang,
- Abstract summary: Embodied intelligence focuses on training agents within realistic simulated environments.<n>TongSIM is a high-fidelity, general-purpose platform for training and evaluating embodied agents.
- Score: 59.27575233453533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As artificial intelligence (AI) rapidly advances, especially in multimodal large language models (MLLMs), research focus is shifting from single-modality text processing to the more complex domains of multimodal and embodied AI. Embodied intelligence focuses on training agents within realistic simulated environments, leveraging physical interaction and action feedback rather than conventionally labeled datasets. Yet, most existing simulation platforms remain narrowly designed, each tailored to specific tasks. A versatile, general-purpose training environment that can support everything from low-level embodied navigation to high-level composite activities, such as multi-agent social simulation and human-AI collaboration, remains largely unavailable. To bridge this gap, we introduce TongSIM, a high-fidelity, general-purpose platform for training and evaluating embodied agents. TongSIM offers practical advantages by providing over 100 diverse, multi-room indoor scenarios as well as an open-ended, interaction-rich outdoor town simulation, ensuring broad applicability across research needs. Its comprehensive evaluation framework and benchmarks enable precise assessment of agent capabilities, such as perception, cognition, decision-making, human-robot cooperation, and spatial and social reasoning. With features like customized scenes, task-adaptive fidelity, diverse agent types, and dynamic environmental simulation, TongSIM delivers flexibility and scalability for researchers, serving as a unified platform that accelerates training, evaluation, and advancement toward general embodied intelligence.
Related papers
- Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning [62.499592503950026]
Large language model (LLM) have empowered autonomous agents to perform complex tasks that require multi-turn interactions with tools and environments.<n>We propose Agent World Model (AWM), a fully synthetic environment generation pipeline.<n>We scale to 1,000 environments covering everyday scenarios, in which agents can interact with rich toolsets.
arXiv Detail & Related papers (2026-02-10T18:55:41Z) - GAMMS: Graph based Adversarial Multiagent Modeling Simulator [15.681127447904322]
We present GAMMS (Graph based Adrial Multiagent Modeling Simulator), a lightweight yet scalable simulation framework.<n>GAMMS emphasizes five core objectives: scalability, ease of use, integration-first architecture, fast visualization feedback, and real-world grounding.<n>It enables efficient simulation of complex domains such as urban road networks and communication systems.
arXiv Detail & Related papers (2026-02-04T22:38:51Z) - FreeAskWorld: An Interactive and Closed-Loop Simulator for Human-Centric Embodied AI [24.545163508739943]
FreeAskWorld is an interactive simulation framework that integrates large language models for high-level behavior planning and semantically grounded interaction.<n>Our framework supports scalable, realistic human-agent simulations and includes a modular data generation pipeline tailored for diverse embodied tasks.<n>We present and publicly release FreeAskWorld, a large-scale benchmark dataset comprising reconstructed environments, six diverse task types, 16 core object categories, 63,429 annotated sample frames, and more than 17 hours of interaction data.
arXiv Detail & Related papers (2025-11-17T15:58:46Z) - Dyna-Mind: Learning to Simulate from Experience for Better AI Agents [62.21219817256246]
We argue that current AI agents need ''vicarious trial and error'' - the capacity to mentally simulate alternative futures before acting.<n>We introduce Dyna-Mind, a two-stage training framework that explicitly teaches (V)LM agents to integrate such simulation into their reasoning.
arXiv Detail & Related papers (2025-10-10T17:30:18Z) - The Indispensable Role of User Simulation in the Pursuit of AGI [37.789218939871105]
We argue that realistic simulators provide the necessary environments for scalable evaluation, data generation for interactive learning, and fostering the adaptive capabilities central to Artificial General Intelligence (AGI)<n>This article elaborates on the critical role of user simulation for AGI, explores the interdisciplinary nature of building realistic simulators, identifies key challenges including those posed by large language models, and proposes a future research agenda.
arXiv Detail & Related papers (2025-09-23T18:12:45Z) - Towards General Agentic Intelligence via Environment Scaling [78.66355092082253]
Advanced agentic intelligence is a prerequisite for deploying Large Language Models in real-world applications.<n>We design a scalable framework that automatically constructs heterogeneous environments that are fully simulated.<n>Experiments on agentic benchmarks, tau-bench, tau2-Bench, and ACEBench, demonstrate that our trained model, AgentScaler, significantly enhances the function-calling capability of models.
arXiv Detail & Related papers (2025-09-16T17:57:20Z) - Simulation Agent: A Framework for Integrating Simulation and Large Language Models for Enhanced Decision-Making [0.7499722271664147]
Large language models (LLMs) provide intuitive, language-based interactions but can lack the structured, causal understanding required to reliably model complex real-world dynamics.<n>We introduce our simulation agent framework, a novel approach that integrates the strengths of both simulation models and LLMs.
arXiv Detail & Related papers (2025-05-19T22:27:18Z) - LMAgent: A Large-scale Multimodal Agents Society for Multi-user Simulation [66.52371505566815]
Large language models (LLMs)-based AI agents have made significant progress, enabling them to achieve human-like intelligence.<n>We present LMAgent, a very large-scale and multimodal agents society based on multimodal LLMs.<n>In LMAgent, besides chatting with friends, the agents can autonomously browse, purchase, and review products, even perform live streaming e-commerce.
arXiv Detail & Related papers (2024-12-12T12:47:09Z) - EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment [38.14321677323052]
Embodied artificial intelligence emphasizes the role of an agent's body in generating human-like behaviors.
In this paper, we construct a benchmark platform for embodied intelligence evaluation in real-world city environments.
arXiv Detail & Related papers (2024-10-12T17:49:26Z) - LEGENT: Open Platform for Embodied Agents [60.71847900126832]
We introduce LEGENT, an open, scalable platform for developing embodied agents using Large Language Models (LLMs) and Large Multimodal Models (LMMs)
LEGENT offers a rich, interactive 3D environment with communicable and actionable agents, paired with a user-friendly interface.
In experiments, an embryonic vision-language-action model trained on LEGENT-generated data surpasses GPT-4V in embodied tasks.
arXiv Detail & Related papers (2024-04-28T16:50:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.