PartDiffuser: Part-wise 3D Mesh Generation via Discrete Diffusion
- URL: http://arxiv.org/abs/2511.18801v1
- Date: Mon, 24 Nov 2025 06:11:21 GMT
- Title: PartDiffuser: Part-wise 3D Mesh Generation via Discrete Diffusion
- Authors: Yichen Yang, Hong Li, Haodong Zhu, Linin Yang, Guojun Lei, Sheng Xu, Baochang Zhang
- Abstract summary: PartDiffuser is a novel semi-autoregressive diffusion framework for point-cloud-to-mesh generation. PartDiffuser is based on the DiT architecture and introduces a part-aware cross-attention mechanism. Experiments demonstrate that this method significantly outperforms state-of-the-art (SOTA) models in generating 3D meshes with rich detail.
- Score: 14.879669869466072
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing autoregressive (AR) methods for generating artist-designed meshes struggle to balance global structural consistency with high-fidelity local details, and are susceptible to error accumulation. To address this, we propose PartDiffuser, a novel semi-autoregressive diffusion framework for point-cloud-to-mesh generation. The method first performs semantic segmentation on the mesh and then operates in a "part-wise" manner: it employs autoregression between parts to ensure global topology, while utilizing a parallel discrete diffusion process within each semantic part to precisely reconstruct high-frequency geometric features. PartDiffuser is based on the DiT architecture and introduces a part-aware cross-attention mechanism, using point clouds as hierarchical geometric conditioning to dynamically control the generation process, thereby effectively decoupling the global and local generation tasks. Experiments demonstrate that this method significantly outperforms state-of-the-art (SOTA) models in generating 3D meshes with rich detail, exhibiting exceptional detail representation suitable for real-world applications.
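The abstract's core idea, autoregression *between* parts with parallel masked discrete diffusion *within* each part, can be sketched as a generation loop. This is a minimal illustrative sketch, not the paper's actual implementation: the names (`denoise_step`, `generate_mesh`, `MASK`, the linear unmasking schedule) and the toy `denoiser` interface are all assumptions for illustration.

```python
# Hypothetical sketch of the semi-autoregressive scheme described in the
# abstract: parts are generated one after another (autoregressive), while
# tokens within a part are unmasked in parallel by a discrete diffusion
# process conditioned on the point cloud and on previously generated parts.
import random

MASK = -1  # placeholder id for a not-yet-generated token (assumption)

def denoise_step(tokens, context, step, num_steps, denoiser):
    """Unmask a share of the remaining masked tokens in parallel."""
    masked = [i for i, t in enumerate(tokens) if t == MASK]
    # linear unmasking schedule: reveal an equal share each step,
    # so all tokens are revealed by the final step
    k = max(1, len(masked) // (num_steps - step)) if masked else 0
    for i in random.sample(masked, k):
        tokens[i] = denoiser(tokens, context, i)  # predict token i
    return tokens

def generate_mesh(parts, part_len, point_cloud, denoiser, num_steps=8):
    generated = []                      # tokens of already finished parts
    for _ in range(parts):              # autoregressive over parts
        part = [MASK] * part_len        # all tokens masked initially
        for step in range(num_steps):   # parallel diffusion within a part
            context = (point_cloud, list(generated))
            part = denoise_step(part, context, step, num_steps, denoiser)
        generated.extend(part)          # later parts condition on this one
    return generated
```

The outer loop preserves global topology by feeding finished parts into the context of later ones, while the inner loop reconstructs each part's tokens in parallel rather than one token at a time, which is the decoupling the abstract describes.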
Related papers
- StdGEN++: A Comprehensive System for Semantic-Decomposed 3D Character Generation [57.06461272772509]
StdGEN++ is a novel and comprehensive system for generating high-fidelity, semantically decomposed 3D characters from diverse inputs. It achieves state-of-the-art performance, significantly outperforming existing methods in geometric accuracy and semantic disentanglement. The resulting structural independence unlocks advanced downstream capabilities, including non-destructive editing, physics-compliant animation, and gaze tracking.
arXiv Detail & Related papers (2026-01-12T15:41:27Z)
- LoG3D: Ultra-High-Resolution 3D Shape Modeling via Local-to-Global Partitioning [26.88556500272625]
We propose a novel 3D variational autoencoder framework built upon unsigned distance fields (UDFs). Our core innovation is a local-to-global architecture that processes the UDF by partitioning it into uniform subvolumes, UBlocks. Experiments demonstrate state-of-the-art performance in both reconstruction accuracy and generative quality, yielding superior surface smoothness and geometric flexibility.
arXiv Detail & Related papers (2025-11-13T07:34:43Z)
- Topology Sculptor, Shape Refiner: Discrete Diffusion Model for High-Fidelity 3D Meshes Generation [14.55646181682844]
Topology Sculptor, Shape Refiner (TSSR) is a novel method for generating high-quality, artist-style 3D meshes. We leverage this parallel generation capability through three key innovations. Experiments on complex datasets demonstrate that TSSR generates high-quality 3D artist-style meshes.
arXiv Detail & Related papers (2025-10-24T08:51:48Z)
- Kuramoto Orientation Diffusion Models [67.0711709825854]
Orientation-rich images, such as fingerprints and textures, often exhibit coherent angular patterns. Motivated by the role of phase synchronization in biological systems, we propose a score-based generative model. We achieve competitive results on general image benchmarks and significantly improve generation quality on orientation-dense datasets like fingerprints and textures.
arXiv Detail & Related papers (2025-09-18T18:18:49Z)
- HierOctFusion: Multi-scale Octree-based 3D Shape Generation via Part-Whole-Hierarchy Message Passing [9.953394373473621]
3D content generation remains a fundamental yet challenging task due to the inherent structural complexity of 3D data. We propose HierOctFusion, a part-aware multi-scale octree diffusion model that enhances hierarchical feature interaction for generating fine-grained and sparse object structures. Experiments demonstrate that HierOctFusion achieves superior shape quality and efficiency compared to prior methods.
arXiv Detail & Related papers (2025-08-14T23:12:18Z)
- From Missing Pieces to Masterpieces: Image Completion with Context-Adaptive Diffusion [98.31811240195324]
ConFill is a novel framework that reduces discrepancies between generated and original images at each diffusion step. It outperforms current methods, setting a new benchmark in image completion.
arXiv Detail & Related papers (2025-04-19T13:40:46Z)
- MARS: Mesh AutoRegressive Model for 3D Shape Detailization [85.95365919236212]
We introduce MARS, a novel approach for 3D shape detailization. We propose a mesh autoregressive model capable of generating such latent representations through next-LOD token prediction. Experiments conducted on the challenging 3D Shape Detailization benchmark demonstrate that our proposed MARS model achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-02-17T03:12:16Z)
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract priors from well-trained transformers on massive images.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- Double-Shot 3D Shape Measurement with a Dual-Branch Network for Structured Light Projection Profilometry [14.749887303860717]
We propose a dual-branch Convolutional Neural Network (CNN)-Transformer network (PDCNet) to process different structured light (SL) modalities. Within PDCNet, a Transformer branch is used to capture global perception in the fringe images, while a CNN branch is designed to collect local details in the speckle images. Our method can reduce fringe order ambiguity while producing high-accuracy results on self-made datasets.
arXiv Detail & Related papers (2024-07-19T10:49:26Z)
- Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.