MCMat: Multiview-Consistent and Physically Accurate PBR Material Generation
- URL: http://arxiv.org/abs/2412.14148v1
- Date: Wed, 18 Dec 2024 18:45:35 GMT
- Title: MCMat: Multiview-Consistent and Physically Accurate PBR Material Generation
- Authors: Shenhao Zhu, Lingteng Qiu, Xiaodong Gu, Zhengyi Zhao, Chao Xu, Yuxiao He, Zhe Li, Xiaoguang Han, Yao Yao, Xun Cao, Siyu Zhu, Weihao Yuan, Zilong Dong, Hao Zhu
- Abstract summary: Existing 2D methods use UNet-based diffusion models to generate multi-view physically-based rendering (PBR) maps but struggle with multi-view inconsistency, while some 3D methods directly generate UV maps and face generalization issues due to limited 3D data.
In the generation stage, a Diffusion Transformer (DiT) model generates PBR materials, where the multi-branch DiT and reference-based DiT blocks use global attention to fuse features across different views.
- Score: 30.69364954074992
- License:
- Abstract: Existing 2D methods utilize UNet-based diffusion models to generate multi-view physically-based rendering (PBR) maps but struggle with multi-view inconsistency, while some 3D methods directly generate UV maps, encountering generalization issues due to the limited 3D data. To address these problems, we propose a two-stage approach, including multi-view generation and UV materials refinement. In the generation stage, we adopt a Diffusion Transformer (DiT) model to generate PBR materials, where both the specially designed multi-branch DiT and reference-based DiT blocks adopt a global attention mechanism to promote feature interaction and fusion between different views, thereby improving multi-view consistency. In addition, we adopt a PBR-based diffusion loss to ensure that the generated materials align with realistic physical principles. In the refinement stage, we propose a material-refined DiT that performs inpainting in empty areas and enhances details in UV space. In addition to the normal condition, this refinement also takes the material map from the generation stage as an additional condition to reduce the learning difficulty and improve generalization. Extensive experiments show that our method achieves state-of-the-art performance in texturing 3D objects with PBR materials and provides significant advantages for graphics relighting applications. Project Page: https://lingtengqiu.github.io/2024/MCMat/
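The global attention idea mentioned in the abstract can be illustrated with a minimal sketch: tokens from all views are concatenated into a single sequence, so every token can attend to tokens from every other view. This is not the authors' implementation. The function name `global_multiview_attention`, the toy token values, and the use of identity projections (queries, keys, and values all equal to the input tokens) are assumptions made only for illustration; a real DiT block would use learned projections, multiple heads, and conditioning.

```python
import math


def softmax(row):
    # Numerically stable softmax over one row of attention scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def global_multiview_attention(views):
    """views: list of V views, each a list of N tokens, each a list of d floats.

    All views are assumed to hold the same number of tokens.
    """
    # Flatten (V, N, d) into one sequence of V*N tokens so attention spans all views.
    tokens = [tok for view in views for tok in view]
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)

    # Assumption: identity projections, i.e. Q = K = V = tokens.
    weights = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights.append(softmax(scores))

    # Each fused token is a weighted average of tokens from every view.
    fused = [
        [sum(w * v[j] for w, v in zip(row, tokens)) for j in range(d)]
        for row in weights
    ]

    # Restore the per-view layout.
    n = len(views[0])
    out = [fused[i * n:(i + 1) * n] for i in range(len(views))]
    return out, weights


# Toy input: two views, two tokens per view, feature dimension 2.
views = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [0.5, 0.5]],
]
fused_views, attn = global_multiview_attention(views)
```

In this sketch each row of `attn` has one weight per token across both views, so a token in the first view receives information from the second view as well, which is the cross-view interaction the abstract credits with improving multi-view consistency.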
Related papers
- IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations [64.07859467542664]
Capturing geometric and material information from images remains a fundamental challenge in computer vision and graphics.
Traditional optimization-based methods often require hours of computational time to reconstruct geometry, material properties, and environmental lighting from dense multi-view inputs.
We introduce IDArb, a diffusion-based model designed to perform intrinsic decomposition on an arbitrary number of images under varying illuminations.
arXiv Detail & Related papers (2024-12-16T18:52:56Z)
- TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting [48.97819552366636]
This paper presents TexGaussian, a novel method that uses octant-aligned 3D Gaussian Splatting for rapid PBR material generation.
Our method synthesizes more visually pleasing PBR materials and runs faster than previous methods in both unconditional and text-conditional scenarios.
arXiv Detail & Related papers (2024-11-29T12:19:39Z)
- Boosting 3D Object Generation through PBR Materials [32.732511476490316]
We propose a novel approach to boost the quality of generated 3D objects from the perspective of Physics-Based Rendering (PBR) materials.
For albedo and bump maps, we leverage Stable Diffusion fine-tuned on synthetic data to extract these values.
In terms of roughness and metalness maps, we adopt a semi-automatic process to provide room for interactive adjustment.
arXiv Detail & Related papers (2024-11-25T04:20:52Z)
- Material Anything: Generating Materials for Any 3D Object via Diffusion [39.46553064506517]
We present a fully-automated, unified diffusion framework designed to generate physically-based materials for 3D objects.
Material Anything offers a robust, end-to-end solution adaptable to objects under diverse lighting conditions.
arXiv Detail & Related papers (2024-11-22T18:59:39Z)
- MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D [63.9188712646076]
Texturing is a step in 3D asset production that enhances the visual appeal of 3D assets.
Despite recent advancements, existing methods often yield subpar results, primarily due to local discontinuities.
We propose a novel framework called MVPaint, which can generate high-resolution, seamless textures with multi-view consistency.
arXiv Detail & Related papers (2024-11-04T17:59:39Z)
- Vivid-ZOO: Multi-View Video Generation with Diffusion Model [76.96449336578286]
New challenges lie in the lack of massive captioned multi-view videos and the complexity of modeling such multi-dimensional distribution.
We propose a novel diffusion-based pipeline that generates high-quality multi-view videos centered around a dynamic 3D object from text.
arXiv Detail & Related papers (2024-06-12T21:44:04Z)
- Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model [65.58911408026748]
We propose Grounded-Dreamer to generate 3D assets that can accurately follow complex, compositional text prompts.
We first advocate leveraging text-guided 4-view images as the bottleneck in the text-to-3D pipeline.
We then introduce an attention refocusing mechanism to encourage text-aligned 4-view image generation.
arXiv Detail & Related papers (2024-04-28T04:05:10Z)
- DreamPBR: Text-driven Generation of High-resolution SVBRDF with Multi-modal Guidance [9.214785726215942]
We propose a novel diffusion-based generative framework designed to create spatially-varying appearance properties guided by text and multi-modal controls.
Key to achieving diverse and high-quality PBR material generation lies in integrating the capabilities of recent large-scale vision-language models trained on billions of text-image pairs.
We demonstrate the effectiveness of DreamPBR in material creation, showcasing its versatility and user-friendliness on a wide range of controllable generation and editing applications.
arXiv Detail & Related papers (2024-04-23T02:04:53Z)
- UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation [101.2317840114147]
We present UniDream, a text-to-3D generation framework that incorporates unified diffusion priors.
Our approach consists of three main components: (1) a dual-phase training process to get albedo-normal aligned multi-view diffusion and reconstruction models, (2) a progressive generation procedure for geometry and albedo textures based on Score Distillation Sampling (SDS) using the trained reconstruction and diffusion models, and (3) an innovative application of SDS for finalizing PBR generation while keeping a fixed albedo based on the Stable Diffusion model.
arXiv Detail & Related papers (2023-12-14T09:07:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.