Personalize Your Gaussian: Consistent 3D Scene Personalization from a Single Image
- URL: http://arxiv.org/abs/2505.14537v2
- Date: Tue, 05 Aug 2025 09:48:08 GMT
- Title: Personalize Your Gaussian: Consistent 3D Scene Personalization from a Single Image
- Authors: Yuxuan Wang, Xuanyu Yi, Qingshan Xu, Yuan Zhou, Long Chen, Hanwang Zhang
- Abstract summary: We present Consistent Personalization for 3D Gaussian Splatting (CP-GS), a framework that propagates the single-view reference appearance to novel perspectives. In particular, CP-GS integrates pre-trained image-to-3D generation and iterative LoRA fine-tuning to extract and extend the reference appearance.
- Score: 56.134832639494185
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Personalizing 3D scenes from a single reference image enables intuitive user-guided editing, which requires achieving both multi-view consistency across perspectives and referential consistency with the input image. However, these goals are particularly challenging due to the viewpoint bias caused by the limited perspective provided in a single image. Lacking the mechanisms to effectively expand reference information beyond the original view, existing methods of image-conditioned 3DGS personalization often suffer from this viewpoint bias and struggle to produce consistent results. Therefore, in this paper, we present Consistent Personalization for 3D Gaussian Splatting (CP-GS), a framework that progressively propagates the single-view reference appearance to novel perspectives. In particular, CP-GS integrates pre-trained image-to-3D generation and iterative LoRA fine-tuning to extract and extend the reference appearance, and finally produces faithful multi-view guidance images and the personalized 3DGS outputs through a view-consistent generation process guided by geometric cues. Extensive experiments on real-world scenes show that our CP-GS effectively mitigates the viewpoint bias, achieving high-quality personalization that significantly outperforms existing methods. The code will be released at https://github.com/Yuxuan-W/CP-GS.
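The abstract outlines a multi-stage pipeline: lift the reference image to a coarse 3D proxy, iteratively LoRA-fine-tune a diffusion model to extend the reference appearance beyond the original viewpoint, generate geometry-guided multi-view guidance images, and optimize the 3DGS scene against them. The sketch below is only a reading of that description, not the released CP-GS code; every helper function is a hypothetical placeholder marking where the corresponding component would plug in.

```python
# Hypothetical control-flow sketch of the CP-GS pipeline described in the abstract.
# The helpers below are placeholders, not the authors' API.
from dataclasses import dataclass, field


@dataclass
class GuidanceSet:
    """Reference image plus guidance images generated for novel viewpoints."""
    images: list = field(default_factory=list)
    cameras: list = field(default_factory=list)


def coarse_image_to_3d(reference_image):
    # Placeholder: a pre-trained image-to-3D model lifts the single reference
    # view into a rough 3D proxy that supplies geometric cues.
    return {"proxy_geometry": None}


def finetune_lora(diffusion_model, guidance: GuidanceSet):
    # Placeholder: iterative LoRA fine-tuning on the views collected so far,
    # extending the reference appearance beyond the original viewpoint.
    return diffusion_model


def generate_guided_view(diffusion_model, proxy, camera):
    # Placeholder: view-consistent generation of one guidance image,
    # conditioned on geometric cues rendered from the 3D proxy.
    return f"guidance_image@{camera}"


def fit_3dgs(scene_gaussians, guidance: GuidanceSet):
    # Placeholder: optimize the 3DGS scene against the multi-view guidance
    # images to obtain the personalized output.
    return scene_gaussians


def personalize_scene(scene_gaussians, reference_image, novel_cameras, rounds=3):
    proxy = coarse_image_to_3d(reference_image)
    diffusion_model = "pretrained-diffusion"  # stand-in for a real model
    guidance = GuidanceSet(images=[reference_image], cameras=["reference_view"])

    for _ in range(rounds):  # progressive propagation to new viewpoints
        diffusion_model = finetune_lora(diffusion_model, guidance)
        for cam in novel_cameras:
            guidance.images.append(generate_guided_view(diffusion_model, proxy, cam))
            guidance.cameras.append(cam)

    return fit_3dgs(scene_gaussians, guidance)


if __name__ == "__main__":
    personalized = personalize_scene("initial_3dgs", "reference.png",
                                     ["cam_left", "cam_right"])
```

The outer loop mirrors the abstract's claim that the reference appearance is propagated to novel views progressively rather than in a single pass.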
Related papers
- Intern-GS: Vision Model Guided Sparse-View 3D Gaussian Splatting [95.61137026932062]
Intern-GS is a novel approach that enhances sparse-view Gaussian splatting. We show that Intern-GS achieves state-of-the-art rendering quality across diverse datasets.
arXiv Detail & Related papers (2025-05-27T05:17:49Z)
- ExScene: Free-View 3D Scene Reconstruction with Gaussian Splatting from a Single Image
Existing methods are often limited to reconstructing low-consistency 3D scenes with narrow fields of view from single-view input. We propose ExScene, a two-stage pipeline to reconstruct an immersive 3D scene from any given single-view image. ExScene achieves consistent and immersive scene reconstruction using only single-view input.
arXiv Detail & Related papers (2025-03-31T09:33:22Z)
- HuGDiffusion: Generalizable Single-Image Human Rendering via 3D Gaussian Diffusion [50.02316409061741]
HuGDiffusion is a learning pipeline to achieve novel view synthesis (NVS) of human characters from single-view input images. We aim to generate the set of 3DGS attributes via a diffusion-based framework conditioned on human priors extracted from a single image. Our HuGDiffusion shows significant performance improvements over the state-of-the-art methods.
arXiv Detail & Related papers (2025-01-25T01:00:33Z)
- CATSplat: Context-Aware Transformer with Spatial Guidance for Generalizable 3D Gaussian Splatting from A Single-View Image [18.445769892372528]
We introduce CATSplat, a novel generalizable transformer-based framework for single-view 3D scene reconstruction. By incorporating scene-specific contextual details from text embeddings through cross-attention, we pave the way for context-aware reconstruction. Experiments on large-scale datasets demonstrate the state-of-the-art performance of CATSplat in single-view 3D scene reconstruction.
arXiv Detail & Related papers (2024-12-17T13:32:04Z)
- HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting [47.67153284714988]
We propose a novel hybrid representation, termed HybridGS, using per-image 2D Gaussians for transient objects and 3D Gaussians for the static scene. We also propose a straightforward yet effective multi-stage training strategy to ensure robust training and high-quality view synthesis. Experiments on benchmark datasets show state-of-the-art novel view synthesis performance in both indoor and outdoor scenes.
arXiv Detail & Related papers (2024-12-05T03:20:35Z)
- SplatFormer: Point Transformer for Robust 3D Gaussian Splatting [18.911307036504827]
3D Gaussian Splatting (3DGS) has recently transformed photorealistic reconstruction, achieving high visual fidelity and real-time performance. However, rendering quality deteriorates significantly when test views deviate from the camera angles used during training, posing a major challenge for applications in immersive free-viewpoint rendering and navigation. We introduce SplatFormer, the first point transformer model specifically designed to operate on Gaussian splats. Our model significantly improves rendering quality under extreme novel views, achieving state-of-the-art performance in these challenging scenarios and outperforming various 3DGS regularization techniques, multi-scene models tailored for sparse view synthesis, and diffusion…
arXiv Detail & Related papers (2024-11-10T08:23:27Z)
- PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting [54.7468067660037]
PF3plat sets a new state-of-the-art across all benchmarks, supported by comprehensive ablation studies validating our design choices.
Our framework capitalizes on the fast speed, scalability, and high-quality 3D reconstruction and view synthesis capabilities of 3DGS.
arXiv Detail & Related papers (2024-10-29T15:28:15Z)
- WE-GS: An In-the-wild Efficient 3D Gaussian Representation for Unconstrained Photo Collections [8.261637198675151]
Novel View Synthesis (NVS) from unconstrained photo collections is challenging in computer graphics.
We propose an efficient point-based differentiable rendering framework for scene reconstruction from photo collections.
Our approach outperforms existing approaches in the rendering quality of novel view and appearance synthesis, with high convergence and rendering speed.
arXiv Detail & Related papers (2024-06-04T15:17:37Z)
- GeoGS3D: Single-view 3D Reconstruction via Geometric-aware Diffusion Model and Gaussian Splatting [81.03553265684184]
We introduce GeoGS3D, a framework for reconstructing detailed 3D objects from single-view images.
We propose a novel metric, Gaussian Divergence Significance (GDS), to prune unnecessary operations during optimization.
Experiments demonstrate that GeoGS3D generates images with high consistency across views and reconstructs high-quality 3D objects.
arXiv Detail & Related papers (2024-03-15T12:24:36Z)
- Collaborative Score Distillation for Consistent Visual Synthesis [70.29294250371312]
Collaborative Score Distillation (CSD) is based on Stein Variational Gradient Descent (SVGD).
We show the effectiveness of CSD in a variety of tasks, encompassing the visual editing of panorama images, videos, and 3D scenes.
Our results underline the competency of CSD as a versatile method for enhancing inter-sample consistency, thereby broadening the applicability of text-to-image diffusion models.
arXiv Detail & Related papers (2023-07-04T17:31:50Z)
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.