VRsketch2Gaussian: 3D VR Sketch Guided 3D Object Generation with Gaussian Splatting
- URL: http://arxiv.org/abs/2503.12383v1
- Date: Sun, 16 Mar 2025 07:03:13 GMT
- Title: VRsketch2Gaussian: 3D VR Sketch Guided 3D Object Generation with Gaussian Splatting
- Authors: Songen Gu, Haoxuan Song, Binjie Liu, Qian Yu, Sanyi Zhang, Haiyong Jiang, Jin Huang, Feng Tian
- Abstract summary: We propose VRSketch2Gaussian, the first VR sketch-guided, multi-modal, native 3D object generation framework. VRSS is the first large-scale paired dataset containing VR sketches, text, images, and 3DGS.
- Score: 17.92139776515526
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose VRSketch2Gaussian, the first VR sketch-guided, multi-modal, native 3D object generation framework that incorporates a 3D Gaussian Splatting representation. As part of our work, we introduce VRSS, the first large-scale paired dataset containing VR sketches, text, images, and 3DGS, bridging the gap in multi-modal VR sketch-based generation. Our approach features the following key innovations: 1) Sketch-CLIP feature alignment. We propose a two-stage alignment strategy that bridges the domain gap between sparse VR sketch embeddings and rich CLIP embeddings, facilitating both VR sketch-based retrieval and generation tasks. 2) Fine-grained multi-modal conditioning. We disentangle the 3D generation process by using explicit VR sketches for geometric conditioning and text descriptions for appearance control. To facilitate this, we propose a generalizable VR sketch encoder that effectively aligns the different modalities. 3) Efficient and high-fidelity 3D native generation. Our method leverages a 3D-native generation approach that enables fast and texture-rich 3D object synthesis. Experiments conducted on our VRSS dataset demonstrate that our method achieves high-quality, multi-modal VR sketch-based 3D generation. We believe our VRSS dataset and VRsketch2Gaussian method will benefit the 3D generation community.
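The abstract ships no code, but the Sketch-CLIP alignment idea can be made concrete. The following minimal PyTorch sketch shows one plausible setup: a small PointNet-style encoder maps sparse VR-sketch stroke points into CLIP's embedding space and is trained with a symmetric InfoNCE loss against frozen CLIP features. The encoder architecture, projection head, loss, and all names below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical illustration of sketch-to-CLIP feature alignment.
# Architecture and loss are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SketchEncoder(nn.Module):
    """PointNet-style encoder for a VR sketch given as N stroke points."""
    def __init__(self, clip_dim: int = 512):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
        )
        self.proj = nn.Linear(256, clip_dim)  # project into CLIP space

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (B, N, 3) -> per-point features -> global max-pool
        feats = self.point_mlp(points).max(dim=1).values  # (B, 256)
        return F.normalize(self.proj(feats), dim=-1)      # unit-norm embedding

def clip_alignment_loss(sketch_emb, clip_emb, temperature=0.07):
    """Symmetric InfoNCE between sketch embeddings and frozen CLIP embeddings."""
    logits = sketch_emb @ clip_emb.t() / temperature      # (B, B) similarities
    targets = torch.arange(len(logits), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Toy usage: a batch of 4 sketches with 1024 points each.
enc = SketchEncoder()
sketches = torch.randn(4, 1024, 3)
clip_emb = F.normalize(torch.randn(4, 512), dim=-1)  # stand-in for CLIP features
loss = clip_alignment_loss(enc(sketches), clip_emb)
```

In a two-stage scheme of the kind the abstract describes, one might first pretrain such an encoder for retrieval with this loss and then fine-tune it alongside the generator; this captures the spirit, though not necessarily the letter, of the paper's pipeline.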
Related papers
- EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization.
We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z)
- GSemSplat: Generalizable Semantic 3D Gaussian Splatting from Uncalibrated Image Pairs [33.74118487769923]
We introduce GSemSplat, a framework that learns semantic representations linked to 3D Gaussians without per-scene optimization, dense image collections, or calibration.
We employ a dual-feature approach that leverages both region-specific and context-aware semantic features as supervision in the 2D space.
arXiv Detail & Related papers (2024-12-22T09:06:58Z)
- DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data [50.164670363633704]
We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets from text prompts.
Our model is directly trained on extensive noisy and unaligned 'in-the-wild' 3D assets.
We achieve state-of-the-art performance in both single-class generation and text-to-3D generation.
arXiv Detail & Related papers (2024-06-06T17:58:15Z)
- PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm [114.47216525866435]
We introduce a novel universal 3D pre-training framework designed to facilitate the acquisition of efficient 3D representation.
For the first time, PonderV2 achieves state-of-the-art performance on 11 indoor and outdoor benchmarks, demonstrating its effectiveness.
arXiv Detail & Related papers (2023-10-12T17:59:57Z)
- Investigating Input Modality and Task Geometry on Precision-first 3D Drawing in Virtual Reality [16.795850221628033]
We investigated how task geometry and input modality affect precision-first drawing performance.
We found that, compared to using bare hands, VR controllers and pens yield nearly a 30% precision gain.
arXiv Detail & Related papers (2022-10-21T21:56:43Z)
- Towards 3D VR-Sketch to 3D Shape Retrieval [128.47604316459905]
We study the use of 3D sketches as an input modality and advocate a VR scenario in which retrieval is conducted.
As a first stab at this new 3D VR-sketch to 3D shape retrieval problem, we make four contributions.
arXiv Detail & Related papers (2022-09-20T22:04:31Z)
- Fine-Grained VR Sketching: Dataset and Insights [140.0579567561475]
We present the first fine-grained dataset of 1,497 3D VR sketch and 3D shape pairs from the chair category, with large shape diversity.
Our dataset supports the recent trend in the sketch community on fine-grained data analysis.
arXiv Detail & Related papers (2022-09-20T21:30:54Z)
- Structure-Aware 3D VR Sketch to 3D Shape Retrieval [113.20120789493217]
We focus on the challenge caused by inherent inaccuracies in 3D VR sketches.
We propose to use a triplet loss with an adaptive margin value driven by a "fitting gap"; a minimal illustration of this idea appears after this list.
We introduce a dataset of 202 VR sketches for 202 3D shapes drawn from memory rather than from observation.
arXiv Detail & Related papers (2022-09-19T14:29:26Z)
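The adaptive-margin triplet loss from the last entry above lends itself to a compact illustration. Below is a minimal PyTorch sketch under two assumptions: the "fitting gap" is a per-sample scalar (e.g., a Chamfer distance between a VR sketch and its paired shape), and the margin shrinks as that gap grows. The exact margin schedule used by the paper is not reproduced here; all names are hypothetical.

```python
# Illustrative adaptive-margin triplet loss; the mapping from "fitting gap"
# to margin is an assumption, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def adaptive_triplet_loss(anchor, positive, negative, fitting_gap,
                          base_margin=0.2, scale=1.0):
    """Triplet loss whose margin shrinks as the sketch-to-shape fitting gap
    grows, so inaccurate sketches are pushed less aggressively."""
    d_pos = F.pairwise_distance(anchor, positive)        # sketch <-> target shape
    d_neg = F.pairwise_distance(anchor, negative)        # sketch <-> other shape
    margin = base_margin / (1.0 + scale * fitting_gap)   # assumed schedule
    return F.relu(d_pos - d_neg + margin).mean()

# Toy usage with random 256-d embeddings and per-sample fitting gaps.
a, p, n = (torch.randn(8, 256) for _ in range(3))
gap = torch.rand(8)  # e.g., Chamfer distance between each sketch and its shape
loss = adaptive_triplet_loss(a, p, n, gap)
```

The intuition behind the adaptive margin is that a sketch that only loosely fits its target shape should not be forced as far from negatives as an accurately drawn one.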