Repaint123: Fast and High-quality One Image to 3D Generation with
Progressive Controllable 2D Repainting
- URL: http://arxiv.org/abs/2312.13271v3
- Date: Wed, 27 Dec 2023 10:51:27 GMT
- Title: Repaint123: Fast and High-quality One Image to 3D Generation with
Progressive Controllable 2D Repainting
- Authors: Junwu Zhang, Zhenyu Tang, Yatian Pang, Xinhua Cheng, Peng Jin, Yida
Wei, Munan Ning, Li Yuan
- Abstract summary: We present Repaint123 to alleviate multi-view bias as well as texture degradation and speed up the generation process.
We propose visibility-aware adaptive repainting strength for overlap regions to enhance the generated image quality.
Our method has a superior ability to generate high-quality 3D content with multi-view consistency and fine textures in 2 minutes from scratch.
- Score: 16.957766297050707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent one image to 3D generation methods commonly adopt Score Distillation
Sampling (SDS). Despite the impressive results, there are multiple deficiencies
including multi-view inconsistency, over-saturated and over-smoothed textures,
as well as the slow generation speed. To address these deficiencies, we present
Repaint123 to alleviate multi-view bias as well as texture degradation and
speed up the generation process. The core idea is to combine the powerful image
generation capability of the 2D diffusion model and the texture alignment
ability of the repainting strategy for generating high-quality multi-view
images with consistency. We further propose visibility-aware adaptive
repainting strength for overlap regions to enhance the generated image quality
in the repainting process. The generated high-quality and multi-view consistent
images enable the use of simple Mean Square Error (MSE) loss for fast 3D
content generation. We conduct extensive experiments and show that our method
has a superior ability to generate high-quality 3D content with multi-view
consistency and fine textures in 2 minutes from scratch. Our project page is
available at https://pku-yuangroup.github.io/repaint123/.
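The abstract names two ingredients without giving their formulas: a visibility-aware adaptive repainting strength for overlap regions, and a plain MSE loss for fitting the 3D representation once the repainted views are consistent. The PyTorch sketch below is one plausible reading of those two ideas; the cosine-based visibility proxy, the linear strength mapping, and the function names (adaptive_repaint_strength, mse_refine_step) are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of the two components named in the abstract. The visibility
# proxy (cosine between normal and view direction) and the linear mapping to
# a repainting strength are assumptions made for illustration only.
import torch
import torch.nn.functional as F

def adaptive_repaint_strength(normals, view_dir, s_min=0.3, s_max=1.0):
    """Map per-pixel visibility in the overlap region to a repainting strength.

    normals:  (H, W, 3) unit surface normals rendered from the current view.
    view_dir: (3,) unit vector pointing from the surface toward the camera.
    Regions seen nearly head-on in previous views (high visibility) get a low
    strength so repainting preserves them; grazing or newly revealed regions
    get a strength close to s_max.
    """
    visibility = (normals @ view_dir).clamp(0.0, 1.0)    # (H, W), cos(theta)
    strength = s_max - (s_max - s_min) * visibility      # low where well seen
    return strength

def mse_refine_step(rendered, repainted, optimizer):
    """One refinement step: with multi-view-consistent repainted images,
    a plain MSE loss (instead of SDS) is enough to fit the 3D representation."""
    loss = F.mse_loss(rendered, repainted)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this reading, the per-pixel strength would modulate how strongly the 2D diffusion model repaints the overlap region, while fully novel regions are generated from scratch; those usage details are likewise assumptions based only on the abstract.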
Related papers
- Pandora3D: A Comprehensive Framework for High-Quality 3D Shape and Texture Generation [58.77520205498394]
This report presents a comprehensive framework for generating high-quality 3D shapes and textures from diverse input prompts.
The framework consists of 3D shape generation and texture generation.
This report details the system architecture, experimental results, and potential future directions to improve and expand the framework.
arXiv Detail & Related papers (2025-02-20T04:22:30Z) - MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D [63.9188712646076]
Texturing is a crucial step in 3D asset production, enhancing the visual appeal and diversity of 3D assets.
Despite recent advancements, existing methods often yield subpar results, primarily due to local discontinuities.
We propose a novel framework called MVPaint, which can generate high-resolution, seamless textures with multi-view consistency.
arXiv Detail & Related papers (2024-11-04T17:59:39Z) - Hunyuan3D 1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation [23.87609214530216]
Hunyuan3D 1.0 achieves an impressive balance between speed and quality.
Our framework incorporates the text-to-image model Hunyuan-DiT, making it a unified framework that supports both text- and image-conditioned 3D generation.
arXiv Detail & Related papers (2024-11-04T17:21:42Z) - Meta 3D TextureGen: Fast and Consistent Texture Generation for 3D Objects [54.80813150893719]
We introduce Meta 3D TextureGen: a new feedforward method comprised of two sequential networks aimed at generating high-quality textures in less than 20 seconds.
Our method achieves state-of-the-art results in quality and speed by conditioning a text-to-image model on 3D semantics in 2D space and fusing the outputs into a complete, high-resolution UV texture map.
In addition, we introduce a texture enhancement network that is capable of up-scaling any texture by an arbitrary ratio, producing 4k pixel resolution textures.
arXiv Detail & Related papers (2024-07-02T17:04:34Z) - Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image [28.759158325097093]
Unique3D is a novel image-to-3D framework for efficiently generating high-quality 3D meshes from single-view images.
Our framework features state-of-the-art generation fidelity and strong generalizability.
arXiv Detail & Related papers (2024-05-30T17:59:54Z) - Magic-Boost: Boost 3D Generation with Multi-View Conditioned Diffusion [101.15628083270224]
We propose a novel multi-view conditioned diffusion model to synthesize high-fidelity novel view images.
We then introduce a novel iterative-update strategy that applies this model to provide precise guidance for refining the coarse generated results.
Experiments show that Magic-Boost greatly enhances the coarse generated inputs and produces high-quality 3D assets with rich geometric and textural details.
arXiv Detail & Related papers (2024-04-09T16:20:03Z) - IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality
3D Generation [96.32684334038278]
In this paper, we explore the design space of text-to-3D models.
We significantly improve multi-view generation by considering video instead of image generators.
Our new method, IM-3D, reduces the number of evaluations of the 2D generator network by 10-100x.
arXiv Detail & Related papers (2024-02-13T18:59:51Z) - Efficient Geometry-aware 3D Generative Adversarial Networks [50.68436093869381]
Existing 3D GANs are either compute-intensive or make approximations that are not 3D-consistent.
In this work, we improve the computational efficiency and image quality of 3D GANs without overly relying on these approximations.
We introduce an expressive hybrid explicit-implicit network architecture that synthesizes not only high-resolution multi-view-consistent images in real time but also produces high-quality 3D geometry.
arXiv Detail & Related papers (2021-12-15T08:01:43Z)