DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
- URL: http://arxiv.org/abs/2503.15265v1
- Date: Wed, 19 Mar 2025 14:39:30 GMT
- Title: DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
- Authors: Ruowen Zhao, Junliang Ye, Zhengyi Wang, Guangce Liu, Yiwen Chen, Yikai Wang, Jun Zhu
- Abstract summary: DeepMesh is a framework that optimizes mesh generation through two key innovations. It incorporates a novel tokenization algorithm, along with improvements in data curation and processing. It generates meshes with intricate details and precise topology, outperforming state-of-the-art methods in both precision and quality.
- Score: 21.77406648840365
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Triangle meshes play a crucial role in 3D applications for efficient manipulation and rendering. While auto-regressive methods generate structured meshes by predicting discrete vertex tokens, they are often constrained by limited face counts and mesh incompleteness. To address these challenges, we propose DeepMesh, a framework that optimizes mesh generation through two key innovations: (1) an efficient pre-training strategy incorporating a novel tokenization algorithm, along with improvements in data curation and processing, and (2) the introduction of Reinforcement Learning (RL) into 3D mesh generation to achieve human preference alignment via Direct Preference Optimization (DPO). We design a scoring standard that combines human evaluation with 3D metrics to collect preference pairs for DPO, ensuring both visual appeal and geometric accuracy. Conditioned on point clouds and images, DeepMesh generates meshes with intricate details and precise topology, outperforming state-of-the-art methods in both precision and quality. Project page: https://zhaorw02.github.io/DeepMesh/
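For context on the RL step described above, the sketch below shows the standard DPO preference loss applied to a pair of mesh token sequences. This is a generic illustration rather than DeepMesh's actual code: the function and tensor names are hypothetical, and the per-sequence log-probabilities would come from the auto-regressive mesh transformer and a frozen reference copy of it.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # Implicit reward of each sequence: log-prob ratio vs. the frozen reference
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    # Maximize the margin between the preferred and rejected mesh sequences
    return -F.logsigmoid(beta * (chosen_reward - rejected_reward)).mean()

# Toy usage: summed token log-probs for one preference pair
loss = dpo_loss(torch.tensor([-120.3]), torch.tensor([-130.9]),
                torch.tensor([-125.0]), torch.tensor([-128.4]))
print(loss)
```

In the paper's setting, the "chosen" and "rejected" sequences would be the preference pairs collected with the combined human-evaluation and 3D-metric scoring standard.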
Related papers
- MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs [79.45006864728893]
MeshCraft is a framework for efficient and controllable mesh generation.
It uses continuous spatial diffusion to generate discrete triangle faces.
It can generate an 800-face mesh in just 3.2 seconds.
arXiv Detail & Related papers (2025-03-29T09:21:50Z)
- DMesh++: An Efficient Differentiable Mesh for Complex Shapes [51.75054400014161]
We introduce a new differentiable mesh processing method in 2D and 3D.
We present an algorithm that adapts the mesh resolution to local geometry in 2D for efficient representation.
We demonstrate the effectiveness of our approach on 2D point cloud and 3D multi-view reconstruction tasks.
arXiv Detail & Related papers (2024-12-21T21:16:03Z)
- ConvMesh: Reimagining Mesh Quality Through Convex Optimization [55.2480439325792]
This research introduces a convex optimization technique, disciplined convex programming, to enhance existing meshes.
By focusing on a sparse set of point clouds from both the original and target meshes, the method demonstrates significant improvements in mesh quality with minimal data requirements.
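As a generic illustration of disciplined convex programming (not ConvMesh's actual formulation), the sketch below uses CVXPY to nudge vertex positions toward a sparse set of target samples while staying close to the original mesh; all names and weights are hypothetical.

```python
import cvxpy as cp
import numpy as np

# Generic DCP example, not the paper's formulation: fit vertex positions
# to sparse target samples while penalizing drift from the original mesh.
V = np.random.rand(100, 3)                  # original vertex positions
T = V + 0.05 * np.random.randn(100, 3)      # hypothetical target samples

X = cp.Variable((100, 3))
objective = cp.Minimize(cp.sum_squares(X - T) + 0.1 * cp.sum_squares(X - V))
cp.Problem(objective).solve()               # DCP-compliant, so CVXPY accepts it
```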
arXiv Detail & Related papers (2024-12-11T15:48:25Z)
- MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization [65.15226276553891]
MeshAnything V2 is an advanced mesh generation model designed to create Artist-Created Meshes.
A key innovation behind MeshAnything V2 is our novel Adjacent Mesh Tokenization (AMT) method.
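The snippet names AMT without detail; as a hedged illustration of adjacency-based tokenization (not necessarily the paper's exact scheme), the sketch below emits a single vertex token when consecutive faces share an edge and otherwise restarts with a full triangle after a hypothetical RESET token.

```python
# Hedged sketch of adjacency-based face tokenization in the spirit of AMT.
# A real decoder would also need edge/orientation bookkeeping omitted here.
RESET = -1  # hypothetical control token

def tokenize(faces):
    tokens, prev = [], None
    for f in faces:
        if prev is not None and len(set(f) & set(prev)) == 2:
            (new_v,) = set(f) - set(prev)   # the single unshared vertex
            tokens.append(new_v)
        else:
            tokens.extend([RESET, *f])      # restart: emit all 3 vertices
        prev = f
    return tokens

print(tokenize([(0, 1, 2), (1, 2, 3), (5, 6, 7)]))
# -> [-1, 0, 1, 2, 3, -1, 5, 6, 7]
```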
arXiv Detail & Related papers (2024-08-05T15:33:45Z)
- DeCoTR: Enhancing Depth Completion with 2D and 3D Attentions [41.55908366474901]
We introduce a novel approach that harnesses both 2D and 3D attentions to enable highly accurate depth completion.
We evaluate our method, DeCoTR, on established depth completion benchmarks.
arXiv Detail & Related papers (2024-03-18T19:22:55Z) - Dynamic 3D Point Cloud Sequences as 2D Videos [81.46246338686478]
3D point cloud sequences serve as one of the most common and practical representation modalities of real-world environments.
We propose a novel generic representation called Structured Point Cloud Videos (SPCVs).
SPCVs re-organize a point cloud sequence as a 2D video with spatial smoothness and temporal consistency, where pixel values correspond to the 3D coordinates of points.
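A minimal sketch of the data layout implied by that description follows; the actual point-to-pixel assignment in the paper is learned, and everything here (shapes, names) is illustrative.

```python
import numpy as np

# Illustrative layout only: an SPCV stores each frame of a point cloud
# sequence as an H x W "image" whose pixel values are 3D coordinates.
T, H, W = 8, 32, 32                      # 8 frames, 1024 points per frame
spcv = np.zeros((T, H, W, 3), dtype=np.float32)

points = np.random.rand(H * W, 3).astype(np.float32)  # one raw point cloud
spcv[0] = points.reshape(H, W, 3)        # hypothetical structured assignment
xyz = spcv[0, 10, 20]                    # pixel (10, 20) holds a 3D point
```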
arXiv Detail & Related papers (2024-03-02T08:18:57Z)
- Sampling is Matter: Point-guided 3D Human Mesh Reconstruction [0.0]
This paper presents a simple yet powerful method for 3D human mesh reconstruction from a single RGB image.
Experimental results on benchmark datasets show that the proposed method efficiently improves the performance of 3D human mesh reconstruction.
arXiv Detail & Related papers (2023-04-19T08:45:26Z)
- Multi-initialization Optimization Network for Accurate 3D Human Pose and Shape Estimation [75.44912541912252]
We propose a three-stage framework named Multi-Initialization Optimization Network (MION).
In the first stage, we strategically select different coarse 3D reconstruction candidates that are compatible with the 2D keypoints of the input sample.
In the second stage, we design a mesh refinement transformer (MRT) to refine each coarse reconstruction result via a self-attention mechanism.
Finally, a Consistency Estimation Network (CEN) is proposed to find the best result among multiple candidates by evaluating whether the visual evidence in the RGB image matches a given 3D reconstruction.
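The three stages compose a generate-refine-select pipeline; the sketch below mirrors that structure. Every function here is a hypothetical stand-in for the paper's components, stubbed out so the skeleton runs.

```python
import random

# Hypothetical stand-ins for MION's three described stages (illustrative only).
def select_candidates(kp2d, n):                  # Stage 1: coarse candidates
    return [f"coarse_mesh_{i}" for i in range(n)]

def mesh_refinement_transformer(mesh):           # Stage 2: MRT refinement
    return mesh + "_refined"

def consistency_estimation_network(img, mesh):   # Stage 3: CEN score
    return random.random()

def mion(image, keypoints_2d, n_candidates=5):
    candidates = select_candidates(keypoints_2d, n_candidates)
    refined = [mesh_refinement_transformer(c) for c in candidates]
    scores = [consistency_estimation_network(image, m) for m in refined]
    return refined[scores.index(max(scores))]    # best-scoring reconstruction

print(mion(image=None, keypoints_2d=None))
```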
arXiv Detail & Related papers (2021-12-24T02:43:58Z)
- Efficient 3D Deep LiDAR Odometry [16.388259779644553]
This paper proposes PWCLO-Net, an efficient 3D point cloud learning architecture for LiDAR odometry.
The entire architecture is holistically optimized end-to-end to achieve adaptive learning of the cost volume and mask.
arXiv Detail & Related papers (2021-11-03T11:09:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.