Related papers: RL Dreams: Policy Gradient Optimization for Score Distillation based 3D Generation

URL: http://arxiv.org/abs/2312.04806v1
Date: Fri, 8 Dec 2023 02:41:04 GMT
Title: RL Dreams: Policy Gradient Optimization for Score Distillation based 3D Generation
Authors: Aradhya N. Mathur, Phu Pham, Aniket Bera, Ojaswa Sharma
Abstract summary: Score Distillation Sampling (SDS) based rendering has improved 3D asset generation to a great extent. DDPO3D employs the policy gradient method in tandem with aesthetic scoring to improve 3D rendering from 2D diffusion models. Our approach is compatible with score distillation-based methods, which would facilitate the integration of diverse reward functions into the generative process.
Score: 15.154441074606101
License: http://creativecommons.org/licenses/by/4.0/
Abstract: 3D generation has rapidly accelerated in the past decade owing to the progress in the field of generative modeling. Score Distillation Sampling (SDS) based rendering has improved 3D asset generation to a great extent. Further, the recent work of Denoising Diffusion Policy Optimization (DDPO) demonstrates that the diffusion process is compatible with policy gradient methods and has been demonstrated to improve the 2D diffusion models using an aesthetic scoring function. We first show that this aesthetic scorer acts as a strong guide for a variety of SDS-based methods and demonstrates its effectiveness in text-to-3D synthesis. Further, we leverage the DDPO approach to improve the quality of the 3D rendering obtained from 2D diffusion models. Our approach, DDPO3D, employs the policy gradient method in tandem with aesthetic scoring. To the best of our knowledge, this is the first method that extends policy gradient methods to 3D score-based rendering and shows improvement across SDS-based methods such as DreamGaussian, which are currently driving research in text-to-3D synthesis. Our approach is compatible with score distillation-based methods, which would facilitate the integration of diverse reward functions into the generative process. Our project page can be accessed via https://ddpo3d.github.io.
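
The abstract pairs a policy gradient method with a scalar reward (the aesthetic score). As a rough illustration of that general idea only, the sketch below runs a score-function (REINFORCE-style) update on a one-dimensional Gaussian "policy". It is not DDPO3D or DDPO: the reward function, the learning rate, and the names `reward`, `reinforce_step`, and `optimize` are all hypothetical stand-ins chosen for this toy example.

```python
import random


def reward(x):
    # Hypothetical stand-in for a learned scorer (e.g. an aesthetic model):
    # the reward is highest when the sample equals 3.0.
    return -(x - 3.0) ** 2


def reinforce_step(mu, sigma, rng, n_samples=256, lr=0.05):
    # Sample from the current policy N(mu, sigma^2).
    samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
    rewards = [reward(x) for x in samples]
    # Subtract the mean reward as a baseline to reduce variance.
    baseline = sum(rewards) / n_samples
    # Score-function estimator: E[(r - b) * d/dmu log N(x; mu, sigma^2)],
    # where d/dmu log N(x; mu, sigma^2) = (x - mu) / sigma^2.
    grad = sum(
        (r - baseline) * (x - mu) / sigma ** 2
        for x, r in zip(samples, rewards)
    ) / n_samples
    # Gradient ascent on the expected reward.
    return mu + lr * grad


def optimize(steps=300, seed=0):
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    for _ in range(steps):
        mu = reinforce_step(mu, sigma, rng)
    return mu


if __name__ == "__main__":
    print(optimize())  # the policy mean should drift toward 3.0
```

The reward is only evaluated on samples and never differentiated, which is the property that would let a non-differentiable scorer be swapped in.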

Related papers

Text-to-3D Generation by 2D Editing [17.17448279533487]
Distilling 3D representations from pretrained 2D diffusion models is essential for 3D creative applications across gaming, film, and interior design. Current SDS-based methods are hindered by inefficient information distillation from diffusion models, which prevents the creation of photorealistic 3D contents. We propose 3D Generation by Editing (GE3D), which exploits pretrained diffusion models to distill multi-granularity information through multiple denoising steps.
arXiv Detail & Related papers (2024-12-08T12:53:05Z)
MVGaussian: High-Fidelity text-to-3D Content Generation with Multi-View Guidance and Surface Densification [13.872254142378772]
This paper introduces a unified framework for text-to-3D content generation. Our approach utilizes multi-view guidance to iteratively form the structure of the 3D model. We also introduce a novel densification algorithm that aligns Gaussians close to the surface.
arXiv Detail & Related papers (2024-09-10T16:16:34Z)
VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation [69.68568248073747]
We propose Pose-dependent Consistency Distillation Sampling (PCDS), a novel yet efficient objective for diffusion-based 3D generation tasks. PCDS builds the pose-dependent consistency function within diffusion trajectories, allowing true gradients to be approximated with minimal sampling steps. For efficient generation, we propose a coarse-to-fine optimization strategy, which first utilizes 1-step PCDS to create the basic structure of 3D objects, and then gradually increases PCDS steps to generate fine-grained details.
arXiv Detail & Related papers (2024-06-21T08:21:52Z)
LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation [73.36690511083894]
This paper introduces a novel framework called LN3Diff to address a unified 3D diffusion pipeline. Our approach harnesses a 3D-aware architecture and variational autoencoder to encode the input image into a structured, compact, and 3D latent space. It achieves state-of-the-art performance on ShapeNet for 3D generation and demonstrates superior performance in monocular 3D reconstruction and conditional 3D generation.
arXiv Detail & Related papers (2024-03-18T17:54:34Z)
BoostDream: Efficient Refining for High-Quality Text-to-3D Generation from Multi-View Diffusion [0.0]
BoostDream is a highly efficient plug-and-play 3D refining method designed to transform coarse 3D assets into high-quality ones. We introduce 3D model distillation that fits differentiable representations from the 3D assets obtained through feed-forward generation. A novel multi-view SDS loss is designed, which utilizes a multi-view aware 2D diffusion model to refine the 3D assets.
arXiv Detail & Related papers (2024-01-30T05:59:00Z)
Text-to-3D with Classifier Score Distillation [80.14832887529259]
Classifier-free guidance is considered an auxiliary trick rather than the most essential component. The proposed method, Classifier Score Distillation (CSD), can be interpreted as using an implicit classification model for generation. We validate the effectiveness of CSD across a variety of text-to-3D tasks including shape generation, texture synthesis, and shape editing.
arXiv Detail & Related papers (2023-10-30T10:25:40Z)
HiFi-123: Towards High-fidelity One Image to 3D Content Generation [64.81863143986384]
HiFi-123 is a method designed for high-fidelity and multi-view consistent 3D generation. We present a Reference-Guided Novel View Enhancement (RGNV) technique that significantly improves the fidelity of diffusion-based zero-shot novel view synthesis methods. We also present a novel Reference-Guided State Distillation (RGSD) loss.
arXiv Detail & Related papers (2023-10-10T16:14:20Z)
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation [55.661467968178066]
We propose DreamGaussian, a novel 3D content generation framework that achieves both efficiency and quality simultaneously. Our key insight is to design a generative 3D Gaussian Splatting model with companioned mesh extraction and texture refinement in UV space. In contrast to the occupancy pruning used in Neural Radiance Fields, we demonstrate that the progressive densification of 3D Gaussians converges significantly faster for 3D generative tasks.
arXiv Detail & Related papers (2023-09-28T17:55:05Z)
Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models. Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.