Improved 3D Scene Stylization via Text-Guided Generative Image Editing with Region-Based Control
- URL: http://arxiv.org/abs/2509.05285v1
- Date: Thu, 04 Sep 2025 15:01:01 GMT
- Title: Improved 3D Scene Stylization via Text-Guided Generative Image Editing with Region-Based Control
- Authors: Haruo Fujiwara, Yusuke Mukuta, Tatsuya Harada
- Abstract summary: We introduce techniques that enhance the quality of 3D stylization while maintaining view consistency and providing optional region-controlled style transfer. Our method achieves stylization by re-training an initial 3D representation using stylized multi-view 2D images of the source views. We propose Multi-Region Importance-Weighted Sliced Wasserstein Distance Loss, allowing styles to be applied to distinct image regions using segmentation masks from off-the-shelf models.
- Score: 47.14550252881733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in text-driven 3D scene editing and stylization, which leverage the powerful capabilities of 2D generative models, have demonstrated promising outcomes. However, challenges remain in ensuring high-quality stylization and view consistency simultaneously. Moreover, applying style consistently to different regions or objects in the scene with semantic correspondence is a challenging task. To address these limitations, we introduce techniques that enhance the quality of 3D stylization while maintaining view consistency and providing optional region-controlled style transfer. Our method achieves stylization by re-training an initial 3D representation using stylized multi-view 2D images of the source views. Therefore, ensuring both style consistency and view consistency of stylized multi-view images is crucial. We achieve this by extending the style-aligned depth-conditioned view generation framework, replacing the fully shared attention mechanism with a single reference-based attention-sharing mechanism, which effectively aligns style across different viewpoints. Additionally, inspired by recent 3D inpainting methods, we utilize a grid of multiple depth maps as a single-image reference to further strengthen view consistency among stylized images. Finally, we propose Multi-Region Importance-Weighted Sliced Wasserstein Distance Loss, allowing styles to be applied to distinct image regions using segmentation masks from off-the-shelf models. We demonstrate that this optional feature enhances the faithfulness of style transfer and enables the mixing of different styles across distinct regions of the scene. Experimental evaluations, both qualitative and quantitative, demonstrate that our pipeline effectively improves the results of text-driven 3D stylization.
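The Multi-Region Importance-Weighted Sliced Wasserstein Distance Loss is the most concrete algorithmic piece named in the abstract, which does not spell out its form. The PyTorch sketch below therefore only illustrates the general shape such an objective could take: a per-region sliced Wasserstein term, each computed over features selected by a segmentation mask and scaled by an importance weight. Every name here (`sliced_wasserstein`, `multi_region_swd_loss`, `n_proj`, the weight dictionary) is a hypothetical illustration under stated assumptions, not the authors' implementation.

```python
# Minimal sketch of a region-masked, importance-weighted sliced Wasserstein
# loss. Assumption-laden illustration only; not the paper's released code.
import torch
import torch.nn.functional as F

def sliced_wasserstein(feat_a: torch.Tensor,
                       feat_b: torch.Tensor,
                       n_proj: int = 64) -> torch.Tensor:
    """Sliced Wasserstein distance between two feature sets.

    feat_a, feat_b: (N, C) and (M, C) feature vectors, e.g. features of a
    frozen backbone sampled inside one segmentation region.
    """
    c = feat_a.shape[1]
    # Random unit directions defining the 1D projections.
    dirs = F.normalize(torch.randn(c, n_proj, device=feat_a.device), dim=0)
    proj_a = feat_a @ dirs          # (N, n_proj)
    proj_b = feat_b @ dirs          # (M, n_proj)
    # Subsample so sorted projections align one-to-one.
    n = min(proj_a.shape[0], proj_b.shape[0])
    idx_a = torch.randperm(proj_a.shape[0], device=feat_a.device)[:n]
    idx_b = torch.randperm(proj_b.shape[0], device=feat_b.device)[:n]
    sa, _ = torch.sort(proj_a[idx_a], dim=0)
    sb, _ = torch.sort(proj_b[idx_b], dim=0)
    # Squared distance between sorted 1D projections, averaged over slices.
    return ((sa - sb) ** 2).mean()

def multi_region_swd_loss(feats: torch.Tensor,
                          style_feats: dict,
                          masks: dict,
                          weights: dict) -> torch.Tensor:
    """Importance-weighted sum of per-region sliced Wasserstein terms.

    feats:       (C, H, W) features of the image being optimized.
    style_feats: region name -> (M, C) style reference features.
    masks:       region name -> (H, W) boolean mask from an off-the-shelf
                 segmentation model.
    weights:     region name -> scalar importance weight.
    """
    c = feats.shape[0]
    flat = feats.reshape(c, -1).t()                  # (H*W, C)
    loss = feats.new_zeros(())
    for region, mask in masks.items():
        sel = flat[mask.reshape(-1)]                 # features inside region
        if sel.shape[0] == 0:
            continue
        loss = loss + weights[region] * sliced_wasserstein(sel, style_feats[region])
    return loss
```

In a pipeline like the one described, `feats` would come from a frozen feature extractor applied to the image being stylized and `masks` from an off-the-shelf segmentation model, with the per-region weights steering how strongly each region's target style is enforced and enabling the style mixing across regions that the abstract mentions.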
Related papers
- DiffStyle3D: Consistent 3D Gaussian Stylization via Attention Optimization [22.652699040654046]
3D style transfer enables the creation of visually expressive 3D content. We propose DiffStyle3D, a novel diffusion-based paradigm for 3DGS style transfer. We show that DiffStyle3D outperforms state-of-the-art methods, achieving higher stylization quality and visual realism.
arXiv Detail & Related papers (2026-01-27T15:41:11Z) - StyleSculptor: Zero-Shot Style-Controllable 3D Asset Generation with Texture-Geometry Dual Guidance [50.207322685527394]
StyleSculptor is a training-free approach for generating style-guided 3D assets from a content image and one or more style images. It achieves style-guided 3D generation in a zero-shot manner, enabling fine-grained 3D style control. In experiments, StyleSculptor outperforms existing baseline methods in producing high-fidelity 3D assets.
arXiv Detail & Related papers (2025-09-16T17:55:20Z) - SSGaussian: Semantic-Aware and Structure-Preserving 3D Style Transfer [57.723850794113055]
We propose a novel 3D style transfer pipeline that integrates prior knowledge from pretrained 2D diffusion models. Our pipeline consists of two key stages: the first leverages diffusion priors to generate stylized renderings of key viewpoints; the second performs instance-level style transfer, which leverages instance-level consistency across stylized key views and transfers it onto the 3D representation.
arXiv Detail & Related papers (2025-09-04T16:40:44Z) - Multi-StyleGS: Stylizing Gaussian Splatting with Multiple Styles [45.648346391757336]
3D Gaussian Splatting (GS) has emerged as a promising and efficient method for realistic 3D scene modeling. We introduce a novel 3D GS stylization solution termed Multi-StyleGS to tackle these challenges.
arXiv Detail & Related papers (2025-06-07T15:54:34Z) - Styl3R: Instant 3D Stylized Reconstruction for Arbitrary Scenes and Styles [10.472018360278085]
Current state-of-the-art 3D stylization methods typically involve computationally intensive test-time optimization to transfer artistic features into a pretrained representation. We demonstrate a novel approach to achieve direct 3D stylization in less than a second using unposed sparse-view scene images and an arbitrary style image.
arXiv Detail & Related papers (2025-05-27T11:47:15Z) - Style3D: Attention-guided Multi-view Style Transfer for 3D Object Generation [9.212876623996475]
Style3D is a novel approach for generating stylized 3D objects from a content image and a style image. By establishing an interplay between structural and stylistic features across multiple views, our approach enables a holistic 3D stylization process.
arXiv Detail & Related papers (2024-12-04T18:59:38Z) - Towards Multi-View Consistent Style Transfer with One-Step Diffusion via Vision Conditioning [12.43848969320173]
Stylized images from different viewpoints generated by our method achieve superior visual quality, with better structural integrity and less distortion.
Our method effectively preserves the structural information and multi-view consistency in stylized images without any 3D information.
arXiv Detail & Related papers (2024-11-15T12:02:07Z) - Style-NeRF2NeRF: 3D Style Transfer From Style-Aligned Multi-View Images [54.56070204172398]
We propose a simple yet effective pipeline for stylizing a 3D scene.
We perform 3D style transfer by refining the source NeRF model using stylized images generated by a style-aligned image-to-image diffusion model.
We demonstrate that our method can transfer diverse artistic styles to real-world 3D scenes with competitive quality.
arXiv Detail & Related papers (2024-06-19T09:36:18Z) - Learning to Stylize Novel Views [82.24095446809946]
We tackle a 3D scene stylization problem - generating stylized images of a scene from arbitrary novel views.
We propose a point cloud-based method for consistent 3D scene stylization.
arXiv Detail & Related papers (2021-05-27T23:58:18Z) - 3DSNet: Unsupervised Shape-to-Shape 3D Style Transfer [66.48720190245616]
We propose a learning-based approach for style transfer between 3D objects.
The proposed method can synthesize new 3D shapes both in the form of point clouds and meshes.
We extend our technique to implicitly learn the multimodal style distribution of the chosen domains.
arXiv Detail & Related papers (2020-11-26T16:59:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.