EmbodiedSplat: Personalized Real-to-Sim-to-Real Navigation with Gaussian Splats from a Mobile Device
- URL: http://arxiv.org/abs/2509.17430v2
- Date: Tue, 23 Sep 2025 03:58:25 GMT
- Title: EmbodiedSplat: Personalized Real-to-Sim-to-Real Navigation with Gaussian Splats from a Mobile Device
- Authors: Gunjan Chhablani, Xiaomeng Ye, Muhammad Zubair Irshad, Zsolt Kira
- Abstract summary: Embodied AI predominantly relies on simulation for training and evaluation. Sim-to-real transfer remains a major challenge. EmbodiedSplat is a novel approach that personalizes policy training.
- Score: 33.22697339175522
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The field of Embodied AI predominantly relies on simulation for training and evaluation, often using either fully synthetic environments that lack photorealism or high-fidelity real-world reconstructions captured with expensive hardware. As a result, sim-to-real transfer remains a major challenge. In this paper, we introduce EmbodiedSplat, a novel approach that personalizes policy training by efficiently capturing the deployment environment and fine-tuning policies within the reconstructed scenes. Our method leverages 3D Gaussian Splatting (GS) and the Habitat-Sim simulator to bridge the gap between realistic scene capture and effective training environments. Using iPhone-captured deployment scenes, we reconstruct meshes via GS, enabling training in settings that closely approximate real-world conditions. We conduct a comprehensive analysis of training strategies, pre-training datasets, and mesh reconstruction techniques, evaluating their impact on sim-to-real predictivity in real-world scenarios. Experimental results demonstrate that agents fine-tuned with EmbodiedSplat outperform both zero-shot baselines pre-trained on large-scale real-world datasets (HM3D) and synthetically generated datasets (HSSD), achieving absolute success rate improvements of 20% and 40% on a real-world Image Navigation task. Moreover, our approach yields a high sim-vs-real correlation (0.87-0.97) for the reconstructed meshes, underscoring its effectiveness in adapting policies to diverse environments with minimal effort. Project page: https://gchhablani.github.io/embodied-splat.
Related papers
- SAGE: Scalable Agentic 3D Scene Generation for Embodied AI [67.43935343696982]
Existing scene-generation systems often rely on rule-based or task-specific pipelines, yielding artifacts and physically invalid scenes. We present SAGE, an agentic framework that, given a user-specified embodied task, understands the intent and automatically generates simulation-ready environments at scale. The resulting environments are realistic, diverse, and directly deployable in modern simulators for policy training.
arXiv Detail & Related papers (2026-02-10T18:59:55Z)
- Mirage2Matter: A Physically Grounded Gaussian World Model from Video [87.9732484393686]
We present Simulate Anything, a graphics-driven world modeling and simulation framework. Our approach reconstructs real-world environments into a photorealistic scene representation using 3D Gaussian Splatting (3DGS). We then leverage generative models to recover a physically realistic representation and integrate it into a simulation environment via a precision calibration target.
arXiv Detail & Related papers (2026-01-24T07:43:57Z)
- PolaRiS: Scalable Real-to-Sim Evaluations for Generalist Robot Policies [88.78188489161028]
We introduce Policy Evaluation and Environment Reconstruction in Simulation (PolaRiS). PolaRiS is a scalable real-to-sim framework for high-fidelity simulated robot evaluation. We show that PolaRiS evaluations provide a much stronger correlation to real world generalist policy performance than existing simulated benchmarks.
arXiv Detail & Related papers (2025-12-18T18:49:41Z)
- Opening the Sim-to-Real Door for Humanoid Pixel-to-Action Policy Transfer [59.02729900344616]
GPU-accelerated, photorealistic simulation has opened a scalable data-generation path for robot learning. We develop a teacher-student-bootstrap learning framework for vision-based humanoid loco-manipulation. This represents the first humanoid sim-to-real policy capable of diverse articulated loco-manipulation using pure RGB perception.
arXiv Detail & Related papers (2025-11-30T20:07:13Z)
- Real-to-Sim Robot Policy Evaluation with Gaussian Splatting Simulation of Soft-Body Interactions [27.247431258140463]
We present a real-to-sim policy evaluation framework that constructs soft-body digital twins from real-world videos. We validate our approach on representative deformable manipulation tasks, including plush toy packing, rope routing, and T-block pushing.
arXiv Detail & Related papers (2025-11-06T18:52:08Z)
- Synthetic vs. Real Training Data for Visual Navigation [6.5298097830674635]
This paper investigates how the performance of visual navigation policies trained in simulation compares to policies trained with real-world data. We use a navigation policy architecture that bridges the sim-to-real appearance gap by leveraging pretrained visual representations and runs real-time on robot hardware. Our results highlight the importance of diverse image encoder pretraining for sim-to-real generalization, and identify on-policy learning as a key advantage of simulated training over training with real data.
arXiv Detail & Related papers (2025-09-15T11:22:40Z)
- Neural Fidelity Calibration for Informative Sim-to-Real Adaptation [10.117298045153564]
Deep reinforcement learning can seamlessly transfer agile locomotion and navigation skills from the simulator to the real world. However, bridging the sim-to-real gap with domain randomization or adversarial methods often demands expert physics knowledge to ensure policy robustness. We propose Neural Fidelity Calibration (NFC), a novel framework that employs conditional score-based diffusion models to calibrate simulator physical coefficients and residual fidelity domains online during robot execution.
arXiv Detail & Related papers (2025-04-11T15:12:12Z)
- Real-is-Sim: Bridging the Sim-to-Real Gap with a Dynamic Digital Twin [8.498460043101499]
We introduce real-is-sim, a new approach to integrating simulation into behavior cloning pipelines. In contrast to real-only methods, which lack the ability to safely test policies before deployment, and sim-to-real methods, which require complex adaptation to cross the sim-to-real gap, our framework allows policies to seamlessly switch between running on real hardware and running in parallelized virtual environments.
arXiv Detail & Related papers (2025-04-04T17:05:56Z)
- VR-Robo: A Real-to-Sim-to-Real Framework for Visual Robot Navigation and Locomotion [25.440573256776133]
This paper presents a Real-to-Sim-to-Real framework that generates physically interactive "digital twin" simulation environments for visual navigation and locomotion learning.
arXiv Detail & Related papers (2025-02-03T17:15:05Z)
- Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks [93.38375271826202]
We present a method to improve generalization and robustness to distribution shifts in sim-to-real visual quadrotor navigation tasks.
We first build a simulator by integrating Gaussian splatting with quadrotor flight dynamics, and then, train robust navigation policies using Liquid neural networks.
In this way, we obtain a full-stack imitation learning protocol that combines advances in 3D Gaussian splatting radiance field rendering, programming of expert demonstration training data, and the task understanding capabilities of Liquid networks.
arXiv Detail & Related papers (2024-06-21T13:48:37Z)
- DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality [64.51295032956118]
We train a policy that can perform robust dexterous manipulation on an anthropomorphic robot hand.
Our work reaffirms the possibilities of sim-to-real transfer for dexterous manipulation in diverse kinds of hardware and simulator setups.
arXiv Detail & Related papers (2022-10-25T01:51:36Z)
- TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors [74.67698916175614]
We propose TrafficSim, a multi-agent behavior model for realistic traffic simulation.
In particular, we leverage an implicit latent variable model to parameterize a joint actor policy.
We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.
arXiv Detail & Related papers (2021-01-17T00:29:30Z)
- Reactive Long Horizon Task Execution via Visual Skill and Precondition Models [59.76233967614774]
We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner.
We show an increase in success rate from 91.6% to 98% in simulation and from 10% to 80% success rate in the real-world as compared with naive baselines.
arXiv Detail & Related papers (2020-11-17T15:24:01Z)
- SimAug: Learning Robust Representations from Simulation for Trajectory Prediction [78.91518036949918]
We propose a novel approach to learn robust representation through augmenting the simulation training data.
We show that SimAug achieves promising results on three real-world benchmarks using zero real training data.
arXiv Detail & Related papers (2020-04-04T21:22:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.