PacTure: Efficient PBR Texture Generation on Packed Views with Visual Autoregressive Models
- URL: http://arxiv.org/abs/2505.22394v1
- Date: Wed, 28 May 2025 14:23:30 GMT
- Title: PacTure: Efficient PBR Texture Generation on Packed Views with Visual Autoregressive Models
- Authors: Fan Fei, Jiajun Tang, Fei-Peng Tian, Boxin Shi, Ping Tan
- Abstract summary: PacTure is a framework for generating physically-based rendering (PBR) material textures from an untextured 3D mesh. We introduce view packing, a novel technique that increases the effective resolution for each view.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present PacTure, a novel framework for generating physically-based rendering (PBR) material textures from an untextured 3D mesh, a text description, and an optional image prompt. Early 2D generation-based texturing approaches generate textures sequentially from different views, resulting in long inference times and globally inconsistent textures. More recent approaches adopt multi-view generation with cross-view attention to enhance global consistency, which, however, limits the resolution for each view. In response to these weaknesses, we first introduce view packing, a novel technique that significantly increases the effective resolution for each view during multi-view generation without imposing additional inference cost, by formulating the arrangement of multi-view maps as a 2D rectangle bin packing problem. In contrast to UV mapping, it preserves the spatial proximity essential for image generation and maintains full compatibility with current 2D generative models. To further reduce the inference cost, we enable fine-grained control and multi-domain generation within the next-scale prediction autoregressive framework to create an efficient multi-view multi-domain generative backbone. Extensive experiments show that PacTure outperforms state-of-the-art methods in both quality of generated PBR textures and efficiency in training and inference.
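To make the view-packing formulation concrete, below is a minimal Python sketch of arranging multi-view maps as rectangles on one shared canvas. The first-fit "shelf" heuristic, the `View` structure, and all dimensions are illustrative assumptions; the abstract only states that the arrangement is posed as a 2D rectangle bin packing problem and does not specify a particular solver.

```python
# Minimal sketch: pack per-view bounding rectangles onto one canvas.
# Assumption: a simple first-fit "shelf" heuristic stands in for whatever
# bin-packing method PacTure actually uses.
from dataclasses import dataclass

@dataclass
class View:
    name: str
    width: int   # pixel extent of the view's bounding box
    height: int

def pack_views(views, canvas_w, canvas_h):
    """Return {view.name: (x, y)} top-left offsets on the canvas."""
    # Sort tallest-first: a common heuristic that tightens shelf packing.
    order = sorted(views, key=lambda v: v.height, reverse=True)
    placements = {}
    x = y = shelf_h = 0
    for v in order:
        if x + v.width > canvas_w:          # current shelf is full,
            x, y, shelf_h = 0, y + shelf_h, 0  # open a new one below
        if x + v.width > canvas_w or y + v.height > canvas_h:
            raise ValueError(f"{v.name} does not fit on the canvas")
        placements[v.name] = (x, y)
        x += v.width
        shelf_h = max(shelf_h, v.height)
    return placements

# Hypothetical usage: four renders of an object, each cropped to its
# silhouette's bounding box, packed into a single 1200x1600 canvas.
views = [View("front", 600, 800), View("back", 600, 800),
         View("left", 400, 700), View("right", 400, 700)]
print(pack_views(views, 1200, 1600))
```

Because each view occupies only its own bounding rectangle rather than a fixed grid cell, a tight packing lets every view use more pixels of the shared canvas, which is the sense in which view packing raises the effective per-view resolution without enlarging the generated image or adding inference cost.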
Related papers
- SeqTex: Generate Mesh Textures in Video Sequence
We introduce SeqTex, a novel end-to-end framework for training 3D texture generative models. We show that SeqTex achieves state-of-the-art performance on both image-conditioned and text-conditioned 3D texture generation tasks.
arXiv Detail & Related papers (2025-07-06T07:58:36Z)
- FlexPainter: Flexible and Multi-View Consistent Texture Generation
FlexPainter is a novel texture generation pipeline that enables flexible multi-modal conditional guidance. Our framework significantly outperforms state-of-the-art methods in both flexibility and generation quality.
arXiv Detail & Related papers (2025-06-03T08:36:03Z)
- MVPainter: Accurate and Detailed 3D Texture Generation via Multi-View Diffusion with Geometric Control
We investigate 3D texture generation through the lens of three core dimensions: reference-texture alignment, geometry-texture consistency, and local texture quality. We propose MVPainter, which employs data filtering and augmentation strategies to enhance texture fidelity and detail. We extract physically-based rendering (PBR) attributes from the generated views to produce PBR meshes suitable for real-world rendering applications.
arXiv Detail & Related papers (2025-05-19T02:40:24Z)
- RomanTex: Decoupling 3D-aware Rotary Positional Embedded Multi-Attention Network for Texture Synthesis
RomanTex is a multiview-based texture generation framework that integrates a multi-attention network with an underlying 3D representation. Our method achieves state-of-the-art results in texture quality and consistency.
arXiv Detail & Related papers (2025-03-24T17:56:11Z)
- Pandora3D: A Comprehensive Framework for High-Quality 3D Shape and Texture Generation
This report presents a comprehensive framework for generating high-quality 3D shapes and textures from diverse input prompts. The framework consists of 3D shape generation and texture generation. This report details the system architecture, experimental results, and potential future directions to improve and expand the framework.
arXiv Detail & Related papers (2025-02-20T04:22:30Z)
- MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D
Texturing is a crucial step in 3D asset production that enhances the visual appeal and diversity of 3D assets.
Despite recent advancements, existing methods often yield subpar results, primarily due to local discontinuities.
We propose a novel framework called MVPaint, which can generate high-resolution, seamless textures with multi-view consistency.
arXiv Detail & Related papers (2024-11-04T17:59:39Z)
- Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation
We propose Flex3D, a novel framework for generating high-quality 3D content from text, single images, or sparse view images.
In the first stage, we employ a fine-tuned multi-view image diffusion model and a video diffusion model to generate a pool of candidate views, enabling a rich representation of the target 3D object.
In the second stage, the curated views are fed into a Flexible Reconstruction Model (FlexRM), built upon a transformer architecture that can effectively process an arbitrary number of inputs.
arXiv Detail & Related papers (2024-10-01T17:29:43Z)
- GenesisTex2: Stable, Consistent and High-Quality Text-to-Texture Generation
Large-scale text-to-image (T2I) models have shown astonishing results in image generation.
Applying these models to synthesize textures for 3D geometries remains challenging due to the domain gap between 2D images and textures on a 3D surface.
We propose a novel text-to-texture synthesis framework that leverages pretrained diffusion models.
arXiv Detail & Related papers (2024-09-27T02:32:42Z)
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)