TwinOR: Photorealistic Digital Twins of Dynamic Operating Rooms for Embodied AI Research
- URL: http://arxiv.org/abs/2511.07412v1
- Date: Mon, 10 Nov 2025 18:57:09 GMT
- Title: TwinOR: Photorealistic Digital Twins of Dynamic Operating Rooms for Embodied AI Research
- Authors: Han Zhang, Yiqing Shen, Roger D. Soberanis-Mukul, Ankita Ghosh, Hao Ding, Lalithkumar Seenivasan, Jose L. Porras, Zhekai Mao, Chenjia Li, Wenjie Xiao, Lonny Yarmus, Angela Christine Argento, Masaru Ishii, Mathias Unberath,
- Abstract summary: Developing embodied AI for intelligent surgical systems requires safe, controllable environments for continual learning and evaluation.<n>Digital twins provide high-fidelity, risk-free environments for exploration and training.<n>We introduce TwinOR, a framework for constructing photorealistic, dynamic digital twins of operating rooms for embodied AI research.
- Score: 9.65694006177344
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Developing embodied AI for intelligent surgical systems requires safe, controllable environments for continual learning and evaluation. However, safety regulations and operational constraints in operating rooms (ORs) limit embodied agents from freely perceiving and interacting in realistic settings. Digital twins provide high-fidelity, risk-free environments for exploration and training. How we may create photorealistic and dynamic digital representations of ORs that capture relevant spatial, visual, and behavioral complexity remains unclear. We introduce TwinOR, a framework for constructing photorealistic, dynamic digital twins of ORs for embodied AI research. The system reconstructs static geometry from pre-scan videos and continuously models human and equipment motion through multi-view perception of OR activities. The static and dynamic components are fused into an immersive 3D environment that supports controllable simulation and embodied exploration. The proposed framework reconstructs complete OR geometry with centimeter level accuracy while preserving dynamic interaction across surgical workflows, enabling realistic renderings and a virtual playground for embodied AI systems. In our experiments, TwinOR simulates stereo and monocular sensor streams for geometry understanding and visual localization tasks. Models such as FoundationStereo and ORB-SLAM3 on TwinOR-synthesized data achieve performance within their reported accuracy on real indoor datasets, demonstrating that TwinOR provides sensor-level realism sufficient for perception and localization challenges. By establishing a real-to-sim pipeline for constructing dynamic, photorealistic digital twins of OR environments, TwinOR enables the safe, scalable, and data-efficient development and benchmarking of embodied AI, ultimately accelerating the deployment of embodied AI from sim-to-real.
Related papers
- Mirage2Matter: A Physically Grounded Gaussian World Model from Video [87.9732484393686]
We present Simulate Anything, a graphics-driven world modeling and simulation framework.<n>Our approach reconstructs real-world environments into a photorealistic scene representation using 3D Gaussian Splatting (3DGS)<n>We then leverage generative models to recover a physically realistic representation and integrate it into a simulation environment via a precision calibration target.
arXiv Detail & Related papers (2026-01-24T07:43:57Z) - Dyna-Mind: Learning to Simulate from Experience for Better AI Agents [62.21219817256246]
We argue that current AI agents need ''vicarious trial and error'' - the capacity to mentally simulate alternative futures before acting.<n>We introduce Dyna-Mind, a two-stage training framework that explicitly teaches (V)LM agents to integrate such simulation into their reasoning.
arXiv Detail & Related papers (2025-10-10T17:30:18Z) - R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation [78.26308457952636]
This paper introduces R3D2, a lightweight, one-step diffusion model designed to overcome limitations in autonomous driving simulation.<n>It enables realistic insertion of complete 3D assets into existing scenes by generating plausible rendering effects-such as shadows and consistent lighting-in real time.<n>We show that R3D2 significantly enhances the realism of inserted assets, enabling use-cases like text-to-3D asset insertion and cross-scene/dataset object transfer.
arXiv Detail & Related papers (2025-06-09T14:50:19Z) - RealEngine: Simulating Autonomous Driving in Realistic Context [60.55873455475112]
RealEngine is a novel driving simulation framework that holistically integrates 3D scene reconstruction and novel view synthesis techniques.<n>By leveraging real-world multi-modal sensor data, RealEngine reconstructs background scenes and foreground traffic participants separately, allowing for highly diverse and realistic traffic scenarios.<n>RealEngine supports three essential driving simulation categories: non-reactive simulation, safety testing, and multi-agent interaction.
arXiv Detail & Related papers (2025-05-22T17:01:00Z) - DRAWER: Digital Reconstruction and Articulation With Environment Realism [42.13130021795826]
We present DRAWER, a novel framework that converts a video of a static indoor scene into a photorealistic and interactive digital environment.<n>We demonstrate the potential of DRAWER by using it to automatically create an interactive game in Unreal Engine and to enable real-to-sim-to-real transfer for robotics applications.
arXiv Detail & Related papers (2025-04-21T17:59:49Z) - VR-Robo: A Real-to-Sim-to-Real Framework for Visual Robot Navigation and Locomotion [25.440573256776133]
This paper presents a Real-to-Sim-to-Real framework that generates and physically interactive "digital twin" simulation environments for visual navigation and locomotion learning.
arXiv Detail & Related papers (2025-02-03T17:15:05Z) - UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI [37.47562766916571]
We introduce UnrealZoo, a collection of over 100 photo-realistic 3D virtual worlds built on Unreal Engine.<n>We also provide a rich variety of playable entities, including humans, animals, robots, and vehicles for embodied AI research.
arXiv Detail & Related papers (2024-12-30T14:31:01Z) - GeoSim: Photorealistic Image Simulation with Geometry-Aware Composition [81.24107630746508]
We present GeoSim, a geometry-aware image composition process that synthesizes novel urban driving scenes.
We first build a diverse bank of 3D objects with both realistic geometry and appearance from sensor data.
The resulting synthetic images are photorealistic, traffic-aware, and geometrically consistent, allowing image simulation to scale to complex use cases.
arXiv Detail & Related papers (2021-01-16T23:00:33Z) - ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation [75.0278287071591]
ThreeDWorld (TDW) is a platform for interactive multi-modal physical simulation.
TDW enables simulation of high-fidelity sensory data and physical interactions between mobile agents and objects in rich 3D environments.
We present initial experiments enabled by TDW in emerging research directions in computer vision, machine learning, and cognitive science.
arXiv Detail & Related papers (2020-07-09T17:33:27Z) - RoboTHOR: An Open Simulation-to-Real Embodied AI Platform [56.50243383294621]
We introduce RoboTHOR to democratize research in interactive and embodied visual AI.
We show there exists a significant gap between the performance of models trained in simulation when they are tested in both simulations and their carefully constructed physical analogs.
arXiv Detail & Related papers (2020-04-14T20:52:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.