CLIPGaussian: Universal and Multimodal Style Transfer Based on Gaussian Splatting
- URL: http://arxiv.org/abs/2505.22854v1
- Date: Wed, 28 May 2025 20:41:24 GMT
- Title: CLIPGaussian: Universal and Multimodal Style Transfer Based on Gaussian Splatting
- Authors: Kornel Howil, Joanna Waczyńska, Piotr Borycki, Tadeusz Dziarmaga, Marcin Mazur, Przemysław Spurek
- Abstract summary: We introduce CLIPGaussians, the first unified style transfer framework that supports text- and image-guided stylization across multiple modalities. Our method operates directly on Gaussian primitives and integrates into existing GS pipelines as a plug-in module. We demonstrate superior style fidelity and consistency across all tasks, validating CLIPGaussians as a universal and efficient solution for multimodal style transfer.
- Score: 0.42881773214459123
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Gaussian Splatting (GS) has recently emerged as an efficient representation for rendering 3D scenes from 2D images and has been extended to images, videos, and dynamic 4D content. However, applying style transfer to GS-based representations, especially beyond simple color changes, remains challenging. In this work, we introduce CLIPGaussians, the first unified style transfer framework that supports text- and image-guided stylization across multiple modalities: 2D images, videos, 3D objects, and 4D scenes. Our method operates directly on Gaussian primitives and integrates into existing GS pipelines as a plug-in module, without requiring large generative models or retraining from scratch. The CLIPGaussians approach enables joint optimization of color and geometry in 3D and 4D settings, and achieves temporal coherence in videos, while preserving the model size. We demonstrate superior style fidelity and consistency across all tasks, validating CLIPGaussians as a universal and efficient solution for multimodal style transfer.
Related papers
- EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization. We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z)
- Gaussian Billboards: Expressive 2D Gaussian Splatting with Textures [8.724367699416893]
We highlight the similarity between 2D Gaussian Splatting (2DGS) and billboards from traditional computer graphics. We propose a modification of 2DGS to add spatially-varying color achieved using per-splat texture. We show that our method can improve the sharpness and quality of the scene representation in a wide range of qualitative and quantitative evaluations.
arXiv Detail & Related papers (2024-12-17T09:57:04Z)
- WaSt-3D: Wasserstein-2 Distance for Scene-to-Scene Stylization on 3D Gaussians [37.139479729087896]
We develop a new style transfer method for 3D scenes called WaSt-3D.
It faithfully transfers details from style scenes to the content scene without requiring any training.
WaSt-3D consistently delivers results across diverse content and style scenes.
arXiv Detail & Related papers (2024-09-26T15:02:50Z)
- Sketch3D: Style-Consistent Guidance for Sketch-to-3D Generation [55.73399465968594]
This paper proposes a novel generation paradigm Sketch3D to generate realistic 3D assets with shape aligned with the input sketch and color matching the textual description.
Three strategies are designed to optimize 3D Gaussians, i.e., structural optimization via a distribution transfer mechanism, color optimization with a straightforward MSE loss and sketch similarity optimization with a CLIP-based geometric similarity loss.
arXiv Detail & Related papers (2024-04-02T11:03:24Z)
- Hybrid Explicit Representation for Ultra-Realistic Head Avatars [55.829497543262214]
We introduce a novel approach to creating ultra-realistic head avatars and rendering them in real-time. A UV-mapped 3D mesh is utilized to capture sharp and rich textures on smooth surfaces, while 3D Gaussian Splatting is employed to represent complex geometric structures. Experiments show that our modeled results exceed those of state-of-the-art approaches.
arXiv Detail & Related papers (2024-03-18T04:01:26Z)
- StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting [141.05924680451804]
StyleGaussian is a novel 3D style transfer technique.
It allows instant transfer of any image's style to a 3D scene at 10 frames per second (fps).
arXiv Detail & Related papers (2024-03-12T16:44:52Z)
- DreamGaussian4D: Generative 4D Gaussian Splatting [56.49043443452339]
We introduce DreamGaussian4D (DG4D), an efficient 4D generation framework that builds on Gaussian Splatting (GS).
Our key insight is that combining explicit modeling of spatial transformations with static GS makes an efficient and powerful representation for 4D generation.
Video generation methods have the potential to offer valuable spatial-temporal priors, enhancing the quality of 4D generation.
arXiv Detail & Related papers (2023-12-28T17:16:44Z)
- Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models [94.07744207257653]
We focus on the underexplored text-to-4D setting and synthesize dynamic, animated 3D objects.
We combine text-to-image, text-to-video, and 3D-aware multiview diffusion models to provide feedback during 4D object optimization.
arXiv Detail & Related papers (2023-12-21T11:41:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.