Related papers: Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

URL: http://arxiv.org/abs/2512.08478v1
Date: Tue, 09 Dec 2025 10:54:58 GMT
Title: Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
Authors: Yuning Gong, Yifei Liu, Yifan Zhan, Muyao Niu, Xueying Li, Yuanjun Liao, Jiaming Chen, Yuanyuan Gao, Jiaqi Chen, Minming Chen, Li Zhou, Yuning Zhang, Wei Wang, Xiaoqing Hou, Huaxi Huang, Shixiang Tang, Le Ma, Dingwen Zhang, Xue Yang, Junchi Yan, Yanchi Zhang, Yinqiang Zheng, Xiao Sun, Zhihang Zhong
Abstract summary: We present Visionary, an open, web-native platform for real-time rendering of various Gaussian Splatting variants. Visionary enables dynamic neural processing while maintaining a lightweight, "click-to-run" browser experience.
Score: 104.39464309969253
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Neural rendering, particularly 3D Gaussian Splatting (3DGS), has evolved rapidly and become a key component for building world models. However, existing viewer solutions remain fragmented, heavy, or constrained by legacy pipelines, resulting in high deployment friction and limited support for dynamic content and generative models. In this work, we present Visionary, an open, web-native platform for real-time rendering of various Gaussian Splatting variants and meshes. Built on an efficient WebGPU renderer with per-frame ONNX inference, Visionary enables dynamic neural processing while maintaining a lightweight, "click-to-run" browser experience. It introduces a standardized Gaussian Generator contract, which not only supports standard 3DGS rendering but also allows plug-and-play algorithms to generate or update Gaussians each frame. Such inference also enables us to apply feedforward generative post-processing. The platform further offers a plug-in three.js library with a concise TypeScript API for seamless integration into existing web applications. Experiments show that, under identical 3DGS assets, Visionary achieves superior rendering efficiency compared to current Web viewers due to GPU-based primitive sorting. It already supports multiple variants, including MLP-based 3DGS, 4DGS, neural avatars, and style transformation or enhancement networks. By unifying inference and rendering directly in the browser, Visionary significantly lowers the barrier to reproduction, comparison, and deployment of 3DGS-family methods, serving as a unified World Model Carrier for both reconstructive and generative paradigms.
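
Illustrative sketch (not from the paper): the abstract describes a generator contract under which any algorithm may produce or update Gaussians each frame, followed by primitive sorting before rendering. The TypeScript below is a minimal sketch of that idea under stated assumptions. Every name in it (GaussianFrame, GaussianGenerator, staticGenerator, translatingGenerator, sortBackToFront) is hypothetical and is not Visionary's actual API. The toy "dynamic" generator merely shifts positions over time as a stand-in for per-frame network inference, and the sort is a plain CPU reference, whereas the abstract attributes Visionary's efficiency to GPU-based sorting.

```typescript
// Hypothetical sketch: these names are illustrative and are NOT Visionary's API.

/** Packed Gaussian attributes for one frame (assumed layout). */
interface GaussianFrame {
  count: number;
  positions: Float32Array; // xyz triples, length = 3 * count
  opacities: Float32Array; // one value per Gaussian
}

/** Assumed generator contract: produce the Gaussians to draw at a given time. */
interface GaussianGenerator {
  generate(timeSeconds: number): GaussianFrame;
}

/** Static 3DGS asset: the same Gaussians are returned every frame. */
function staticGenerator(frame: GaussianFrame): GaussianGenerator {
  return { generate: () => frame };
}

/** Toy dynamic generator standing in for per-frame inference: shifts x over time. */
function translatingGenerator(base: GaussianFrame, speedX: number): GaussianGenerator {
  return {
    generate(timeSeconds: number): GaussianFrame {
      const positions = new Float32Array(base.positions); // copy, keep base intact
      for (let i = 0; i < base.count; i++) {
        positions[3 * i] += speedX * timeSeconds;
      }
      return { count: base.count, positions, opacities: base.opacities };
    },
  };
}

/** CPU reference: indices ordered far-to-near along a unit view direction. */
function sortBackToFront(
  frame: GaussianFrame,
  viewDir: [number, number, number],
): Uint32Array {
  const depths = new Float32Array(frame.count);
  for (let i = 0; i < frame.count; i++) {
    depths[i] =
      frame.positions[3 * i] * viewDir[0] +
      frame.positions[3 * i + 1] * viewDir[1] +
      frame.positions[3 * i + 2] * viewDir[2];
  }
  const order = new Uint32Array(frame.count);
  for (let i = 0; i < frame.count; i++) order[i] = i;
  order.sort((a, b) => depths[b] - depths[a]); // larger depth (farther) first
  return order;
}

// Usage: three Gaussians at depths 1, 5, and 3 along +z.
const base: GaussianFrame = {
  count: 3,
  positions: new Float32Array([0, 0, 1, 0, 0, 5, 0, 0, 3]),
  opacities: new Float32Array([1, 1, 1]),
};
const moved = translatingGenerator(base, 2).generate(0.5);
console.log(Array.from(sortBackToFront(moved, [0, 0, 1]))); // far-to-near: 1, 2, 0
```

Because the renderer in this sketch only depends on the generate method, a static asset and a time-varying one are interchangeable, which is the property a plug-and-play contract would need.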

Related papers

Quantile Rendering: Efficiently Embedding High-dimensional Feature on 3D Gaussian Splatting [52.18697134979677]
Recent advancements in computer vision have successfully extended Open-vocabulary segmentation (OVS) to the 3D domain by leveraging 3D Gaussian Splatting (3D-GS). Existing methods employ codebooks or feature compression, causing information loss, thereby degrading segmentation quality. We introduce Quantile Rendering (Q-Render), a novel rendering strategy for 3D Gaussians that efficiently handles high-dimensional features while maintaining high fidelity. Our framework outperforms state-of-the-art methods, while enabling real-time rendering with an approximate 43.7x speedup on 512-D feature maps.
arXiv Detail & Related papers (2025-12-24T04:16:18Z)
VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction [45.95623374754385]
VolSplat is a new multi-view feed-forward paradigm that replaces pixel alignment with voxel-aligned Gaussians. It overcomes pixel alignment's reliance on error-prone 2D feature matching, ensuring robust multi-view consistency. Experiments on widely used benchmarks including RealEstate10K and ScanNet demonstrate that VolSplat achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-09-23T17:59:02Z)
FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting [57.97160965244424]
3D Gaussian splatting (3DGS) has enabled various applications in 3D scene representation and novel view synthesis. Previous approaches have focused on pruning less important Gaussians, effectively compressing 3DGS. We present an elastic inference method for 3DGS, achieving substantial rendering performance without additional fine-tuning.
arXiv Detail & Related papers (2025-06-04T17:17:57Z)
iVR-GS: Inverse Volume Rendering for Explorable Visualization via Editable 3D Gaussian Splatting [8.689359004580258]
This paper introduces inverse volume rendering via Gaussian splatting (iVR-GS). iVR-GS reduces the rendering cost while enabling scene editing for interactive volume exploration. We demonstrate the superior reconstruction quality and composability of iVR-GS against other NVS solutions.
arXiv Detail & Related papers (2025-04-24T21:56:53Z)
EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization. We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z)
Generative Gaussian Splatting: Generating 3D Scenes with Video Diffusion Priors [11.156009461711639]
Generative Gaussian Splatting (GGS) is a novel approach that integrates a 3D representation with a pre-trained latent video diffusion model. We evaluate our approach on two common benchmark datasets for scene synthesis, RealEstate10K and ScanNet+.
arXiv Detail & Related papers (2025-03-17T15:24:04Z)
GaussRender: Learning 3D Occupancy with Gaussian Rendering [86.89653628311565]
GaussRender is a module that improves 3D occupancy learning by enforcing projective consistency. Our method penalizes 3D configurations that produce inconsistent 2D projections, thereby enforcing a more coherent 3D structure.
arXiv Detail & Related papers (2025-02-07T16:07:51Z)
NovelGS: Consistent Novel-view Denoising via Large Gaussian Reconstruction Model [57.92709692193132]
NovelGS is a diffusion model for Gaussian Splatting given sparse-view images. We leverage the novel view denoising through a transformer-based network to generate 3D Gaussians.
arXiv Detail & Related papers (2024-11-25T07:57:17Z)
3D Convex Splatting: Radiance Field Rendering with 3D Smooth Convexes [87.01284850604495]
We introduce 3D Convex Splatting (3DCS), which leverages 3D smooth convexes as primitives for modeling geometrically-meaningful radiance fields from multiview images. 3DCS achieves superior performance over 3DGS on benchmarks such as Mip-NeRF360, Tanks and Temples, and Deep Blending. Our results highlight the potential of 3D Convex Splatting to become the new standard for high-quality scene reconstruction.
arXiv Detail & Related papers (2024-11-22T14:31:39Z)
Gaussian Splatting Decoder for 3D-aware Generative Adversarial Networks [10.207899254360374]
NeRF-based 3D-aware Generative Adversarial Networks (GANs) have shown very high rendering quality under large representational variety. However, rendering with Neural Radiance Fields poses challenges for 3D applications. We present a novel approach that combines the high rendering quality of NeRF-based 3D-aware GANs with the flexibility and computational advantages of 3DGS.
arXiv Detail & Related papers (2024-04-16T14:48:40Z)
Gaussian Shell Maps for Efficient 3D Human Generation [96.25056237689988]
3D generative adversarial networks (GANs) have demonstrated state-of-the-art (SOTA) quality and diversity for generated assets. Current 3D GAN architectures, however, rely on volume representations, which are slow to render, thereby hampering the GAN training and requiring multi-view-inconsistent 2D upsamplers.
arXiv Detail & Related papers (2023-11-29T18:04:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.