FPGS: Feed-Forward Semantic-aware Photorealistic Style Transfer of Large-Scale Gaussian Splatting
- URL: http://arxiv.org/abs/2503.09635v1
- Date: Tue, 11 Mar 2025 23:52:56 GMT
- Title: FPGS: Feed-Forward Semantic-aware Photorealistic Style Transfer of Large-Scale Gaussian Splatting
- Authors: GeonU Kim, Kim Youwang, Lee Hyoseok, Tae-Hyun Oh
- Abstract summary: FPGS is a feed-forward photorealistic style transfer method for large-scale radiance fields represented by Gaussian Splatting. FPGS stylizes large-scale 3D scenes with arbitrary, multiple style reference images without additional optimization. In experiments, FPGS achieves photorealistic stylization quality for large-scale static and dynamic 3D scenes.
- Score: 19.307856423583356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present FPGS, a feed-forward photorealistic style transfer method for large-scale radiance fields represented by Gaussian Splatting. FPGS stylizes large-scale 3D scenes with arbitrary, multiple style reference images without additional optimization, while preserving the multi-view consistency and real-time rendering speed of 3D Gaussians. Prior works required tedious per-style optimization or a time-consuming per-scene training stage and were limited to small-scale 3D scenes. FPGS efficiently stylizes large-scale 3D scenes by introducing a style-decomposed 3D feature field, which inherits AdaIN's feed-forward stylization machinery and supports arbitrary style reference images. Furthermore, FPGS supports multi-reference stylization through semantic correspondence matching and local AdaIN, which adds diverse user control over 3D scene styles. FPGS also preserves multi-view consistency by applying the semantic matching and style transfer processes directly to queried features in 3D space. In experiments, we demonstrate that FPGS achieves photorealistic stylization quality for large-scale static and dynamic 3D scenes with diverse reference images. Project page: https://kim-geonu.github.io/FPGS/
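The abstract describes stylization as applying AdaIN directly to features queried in 3D, with a local variant guided by semantic matching when multiple style references are given. The snippet below is a minimal sketch of that general mechanism, not the authors' released implementation: the function names, feature shapes, and the per-class grouping are illustrative assumptions.

```python
# Hedged sketch: AdaIN on per-Gaussian features, plus a hypothetical
# "local AdaIN" that stylizes each semantic group with its own reference.
import torch

def adain(content_feat: torch.Tensor, style_feat: torch.Tensor, eps: float = 1e-5):
    """Align channel-wise mean/std of content features to those of style features.

    content_feat: (N, C) features queried from the 3D feature field (one row per Gaussian).
    style_feat:   (M, C) features extracted from a style reference image.
    """
    c_mean, c_std = content_feat.mean(0, keepdim=True), content_feat.std(0, keepdim=True) + eps
    s_mean, s_std = style_feat.mean(0, keepdim=True), style_feat.std(0, keepdim=True) + eps
    return (content_feat - c_mean) / c_std * s_std + s_mean

def local_adain(content_feat, content_labels, style_feats_by_class):
    """Illustrative multi-reference variant: stylize each semantic group separately,
    mimicking the 'semantic matching + local AdaIN' idea in the abstract."""
    out = content_feat.clone()
    for cls, s_feat in style_feats_by_class.items():
        mask = content_labels == cls
        if mask.any():
            out[mask] = adain(content_feat[mask], s_feat)
    return out

# Toy usage: 10k Gaussians with 64-dim features, two semantic classes, two style references.
feats = torch.randn(10_000, 64)
labels = torch.randint(0, 2, (10_000,))
styles = {0: torch.randn(500, 64), 1: torch.randn(500, 64)}
stylized = local_adain(feats, labels, styles)
```

The feed-forward character of the method comes from the fact that this statistic-matching step needs no per-style optimization; any new reference image only changes the style statistics.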
Related papers
- Visibility-Uncertainty-guided 3D Gaussian Inpainting via Scene Conceptional Learning [63.94919846010485]
3D Gaussian inpainting (3DGI) is challenging because it must effectively leverage complementary visual and semantic cues from multiple input views.
We propose a method that measures the visibility uncertainties of 3D points across different input views and uses them to guide 3DGI.
We build a novel 3DGI framework, VISTA, by integrating VISibility-uncerTainty-guided 3DGI with scene conceptuAl learning.
arXiv Detail & Related papers (2025-04-23T06:21:11Z) - StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians [23.1385740508835]
StyleMe3D is a holistic framework for 3D GS style transfer.
It integrates multi-modal style conditioning, multi-level semantic alignment, and perceptual quality enhancement.
This work bridges photorealistic 3D GS and artistic stylization, unlocking applications in gaming, virtual worlds, and digital art.
arXiv Detail & Related papers (2025-04-21T17:59:55Z) - EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization.
We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z) - Decoupling Appearance Variations with 3D Consistent Features in Gaussian Splatting [50.98884579463359]
We propose DAVIGS, a method that decouples appearance variations in a plug-and-play manner. By transforming the rendering results at the image level instead of the Gaussian level, our approach can model appearance variations with minimal optimization time and memory overhead. We validate our method on several appearance-variant scenes, and demonstrate that it achieves state-of-the-art rendering quality with minimal training time and memory usage.
arXiv Detail & Related papers (2025-01-18T14:55:58Z) - GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization [1.4466437171584356]
We propose a two-stage procedure that integrates dense and robust keypoint descriptors from the lightweight XFeat feature extractor into 3DGS. In the second stage, the initial pose estimate is refined by minimizing the rendering-based photometric warp loss. Benchmarking on widely used indoor and outdoor datasets demonstrates improvements over recent neural rendering-based localization methods.
arXiv Detail & Related papers (2024-09-24T23:18:32Z) - StyleSplat: 3D Object Style Transfer with Gaussian Splatting [0.3374875022248866]
Style transfer can enhance 3D assets with diverse artistic styles, transforming creative expression.
We introduce StyleSplat, a method for stylizing 3D objects in scenes represented by 3D Gaussians from reference style images.
We demonstrate its effectiveness across various 3D scenes and styles, showcasing enhanced control and customization in 3D creation.
arXiv Detail & Related papers (2024-07-12T17:55:08Z) - WE-GS: An In-the-wild Efficient 3D Gaussian Representation for Unconstrained Photo Collections [8.261637198675151]
Novel View Synthesis (NVS) from unconstrained photo collections is challenging in computer graphics.
We propose an efficient point-based differentiable rendering framework for scene reconstruction from photo collections.
Our approach outperforms existing approaches in the rendering quality of novel view and appearance synthesis, with fast convergence and rendering speed.
arXiv Detail & Related papers (2024-06-04T15:17:37Z) - LP-3DGS: Learning to Prune 3D Gaussian Splatting [71.97762528812187]
We propose learning-to-prune 3DGS, where a trainable binary mask is applied to the importance score to find the optimal pruning ratio automatically.
Experiments have shown that LP-3DGS consistently produces a good balance that is both efficient and high quality.
arXiv Detail & Related papers (2024-05-29T05:58:34Z) - 3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation [51.64796781728106]
We propose a generative refinement network to synthesize new content with higher quality by exploiting the natural image prior of the 2D diffusion model and the global 3D information of the current scene.
Our approach supports a wide variety of scene generation and arbitrary camera trajectories with improved visual quality and 3D consistency.
arXiv Detail & Related papers (2024-03-14T14:31:22Z) - Gaussian Splatting in Style [32.41970914897462]
3D scene stylization extends the work of neural style transfer to 3D.
A vital challenge in this problem is to maintain the uniformity of the stylized appearance across multiple views.
We propose a novel architecture trained on a collection of style images that, at test time, produces real time high-quality stylized novel views.
arXiv Detail & Related papers (2024-03-13T13:06:31Z) - StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting [141.05924680451804]
StyleGaussian is a novel 3D style transfer technique.
It allows instant transfer of any image's style to a 3D scene at 10 frames per second (fps).
arXiv Detail & Related papers (2024-03-12T16:44:52Z) - VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction [59.40711222096875]
We present VastGaussian, the first method for high-quality reconstruction and real-time rendering on large scenes based on 3D Gaussian Splatting.
Our approach outperforms existing NeRF-based methods and achieves state-of-the-art results on multiple large scene datasets.
arXiv Detail & Related papers (2024-02-27T11:40:50Z) - FPRF: Feed-Forward Photorealistic Style Transfer of Large-Scale 3D Neural Radiance Fields [23.705795612467956]
FPRF stylizes large-scale 3D scenes with arbitrary, multiple style reference images without additional optimization.
FPRF achieves photorealistic 3D scene stylization quality for large-scale scenes with diverse reference images.
arXiv Detail & Related papers (2024-01-10T19:27:28Z) - SceneWiz3D: Towards Text-guided 3D Scene Composition [134.71933134180782]
Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets.
We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
arXiv Detail & Related papers (2023-12-13T18:59:30Z)