Rethinking Score Distilling Sampling for 3D Editing and Generation
- URL: http://arxiv.org/abs/2505.01888v1
- Date: Sat, 03 May 2025 18:40:39 GMT
- Title: Rethinking Score Distilling Sampling for 3D Editing and Generation
- Authors: Xingyu Miao, Haoran Duan, Yang Long, Jungong Han
- Abstract summary: Unified Distillation Sampling (UDS) is a method that seamlessly integrates the generation and editing of 3D assets. UDS not only outperforms baseline methods in generating 3D assets with richer details but also excels in editing tasks, thereby bridging the gap between 3D generation and editing.
- Score: 50.52808917055502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Score Distillation Sampling (SDS) has emerged as a prominent method for text-to-3D generation by leveraging the strengths of 2D diffusion models. However, SDS is limited to generation tasks and lacks the capability to edit existing 3D assets. Conversely, variants of SDS that introduce editing capabilities often cannot generate new 3D assets effectively. In this work, we observe that the generation and editing processes within SDS and its variants share unified underlying gradient terms. Building on this insight, we propose Unified Distillation Sampling (UDS), a method that seamlessly integrates both the generation and editing of 3D assets. Essentially, UDS refines the gradient terms used in vanilla SDS methods, unifying them to support both tasks. Extensive experiments demonstrate that UDS not only outperforms baseline methods in generating 3D assets with richer details but also excels in editing tasks, thereby bridging the gap between 3D generation and editing. The code is available at: https://github.com/xingy038/UDS.
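For context, the vanilla SDS gradient that this line of work starts from is the standard DreamFusion-style objective, shown here as background only (the abstract does not spell out UDS's specific refinement of it):

  \nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta) = \mathbb{E}_{t,\epsilon}\left[ w(t)\,\big(\epsilon_\phi(x_t;\, y,\, t) - \epsilon\big)\, \frac{\partial x}{\partial \theta} \right]

where x = g(\theta) is a rendered view of the 3D representation \theta, x_t is its noised version at timestep t, \epsilon_\phi is the frozen 2D diffusion model's noise prediction for text prompt y, and w(t) is a timestep weighting. Per the abstract, UDS refines and unifies these gradient terms so that a single objective supports both generation and editing.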
Related papers
- RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling [14.725841457150414]
RewardSDS weights noise samples based on alignment scores from a reward model, producing a weighted SDS loss. This loss prioritizes gradients from noise samples that yield aligned, high-reward outputs. We evaluate RewardSDS and RewardVSD on text-to-image, 2D editing, and text-to-3D generation tasks.
arXiv Detail & Related papers (2025-03-12T17:59:47Z)
- Semantic Score Distillation Sampling for Compositional Text-to-3D Generation [28.88237230872795]
Generating high-quality 3D assets from textual descriptions remains a pivotal challenge in computer graphics and vision research.
We introduce a novel SDS approach, designed to improve the expressiveness and accuracy of compositional text-to-3D generation.
Our approach integrates new semantic embeddings that maintain consistency across different rendering views.
By leveraging explicit semantic guidance, our method unlocks the compositional capabilities of existing pre-trained diffusion models.
arXiv Detail & Related papers (2024-10-11T17:26:00Z)
- VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation [69.68568248073747]
We propose Pose-dependent Consistency Distillation Sampling (PCDS), a novel yet efficient objective for diffusion-based 3D generation tasks.
PCDS builds the pose-dependent consistency function within diffusion trajectories, allowing true gradients to be approximated with minimal sampling steps.
For efficient generation, we propose a coarse-to-fine optimization strategy, which first utilizes 1-step PCDS to create the basic structure of 3D objects, and then gradually increases PCDS steps to generate fine-grained details.
arXiv Detail & Related papers (2024-06-21T08:21:52Z)
- ExactDreamer: High-Fidelity Text-to-3D Content Creation via Exact Score Matching [10.362259643427526]
Current approaches often adapt pre-trained 2D diffusion models for 3D synthesis.
Over-smoothing poses a significant limitation on the high-fidelity generation of 3D models.
LucidDreamer replaces the Denoising Diffusion Probabilistic Model (DDPM) in SDS with the Denoising Diffusion Implicit Model (DDIM).
arXiv Detail & Related papers (2024-05-24T20:19:45Z)
- Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning [52.81032340916171]
Coin3D allows users to control the 3D generation using a coarse geometry proxy assembled from basic shapes.
Our method achieves superior controllability and flexibility in the 3D asset generation task.
arXiv Detail & Related papers (2024-05-13T17:56:13Z)
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present a new framework Sculpt3D that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoints supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z)
- Stable Score Distillation for High-Quality 3D Generation [21.28421571320286]
We decompose Score Distillation Sampling (SDS) into a combination of three functional components, namely mode-seeking, mode-disengaging, and variance-reducing terms (an algebraic sketch of the related split appears after this list).
We show that problems such as over-smoothness and implausibility result from the intrinsic deficiency of the first two terms.
We propose a simple yet effective approach named Stable Score Distillation (SSD) which strategically orchestrates each term for high-quality 3D generation.
arXiv Detail & Related papers (2023-12-14T19:18:38Z)
- StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D [88.66678730537777]
We present StableDreamer, a methodology incorporating three advances.
First, we formalize the equivalence of the SDS generative prior and a simple supervised L2 reconstruction loss.
Second, our analysis shows that while image-space diffusion contributes to geometric precision, latent-space diffusion is crucial for vivid color rendition.
arXiv Detail & Related papers (2023-12-02T02:27:58Z)
- Text-to-3D with Classifier Score Distillation [80.14832887529259]
Classifier-free guidance has been regarded as an auxiliary trick rather than the most essential component.
We name this method Classifier Score Distillation (CSD), which can be interpreted as using an implicit classification model for generation.
We validate the effectiveness of CSD across a variety of text-to-3D tasks including shape generation, texture synthesis, and shape editing.
arXiv Detail & Related papers (2023-10-30T10:25:40Z)
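As an algebraic aside on the decompositions discussed in the Stable Score Distillation and Classifier Score Distillation entries above (the grouping below is the generic classifier-free-guidance identity with neutral labels of my own, not necessarily either paper's exact term definitions): writing the CFG-weighted prediction used in practice as \hat{\epsilon}_\phi = \epsilon_\phi(x_t; \varnothing, t) + s\,[\epsilon_\phi(x_t; y, t) - \epsilon_\phi(x_t; \varnothing, t)] with guidance scale s, the residual that drives the SDS gradient splits as

  \hat{\epsilon}_\phi(x_t; y, t) - \epsilon
    = s\,[\epsilon_\phi(x_t; y, t) - \epsilon_\phi(x_t; \varnothing, t)]   (text-conditioning direction)
    + [\epsilon_\phi(x_t; \varnothing, t) - \epsilon]                      (unconditional denoising residual)

Roughly, Classifier Score Distillation argues that the text-conditioning direction alone can drive generation, while Stable Score Distillation attributes over-smoothness and implausibility to deficiencies of individual terms and re-orchestrates them.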
This list is automatically generated from the titles and abstracts of the papers on this site.