Related papers: MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

URL: http://arxiv.org/abs/2406.10163v2
Date: Wed, 09 Oct 2024 05:00:16 GMT
Title: MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
Authors: Yiwen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Xin Chen, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, Chi Zhang,
Abstract summary: We introduce MeshAnything, a model that treats mesh extraction as a generation problem. By converting 3D assets in any 3D representation into AMs, MeshAnything can be integrated with various 3D asset production methods. Our method generates AMs with hundreds of times fewer faces, significantly improving storage, rendering, and simulation efficiencies.
Score: 76.70891862458384
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, 3D assets created via reconstruction and generation have matched the quality of manually crafted assets, highlighting their potential for replacement. However, this potential is largely unrealized because these assets always need to be converted to meshes for 3D industry applications, and the meshes produced by current mesh extraction methods are significantly inferior to Artist-Created Meshes (AMs), i.e., meshes created by human artists. Specifically, current mesh extraction methods rely on dense faces and ignore geometric features, leading to inefficiencies, complicated post-processing, and lower representation quality. To address these issues, we introduce MeshAnything, a model that treats mesh extraction as a generation problem, producing AMs aligned with specified shapes. By converting 3D assets in any 3D representation into AMs, MeshAnything can be integrated with various 3D asset production methods, thereby enhancing their application across the 3D industry. The architecture of MeshAnything comprises a VQ-VAE and a shape-conditioned decoder-only transformer. We first learn a mesh vocabulary using the VQ-VAE, then train the shape-conditioned decoder-only transformer on this vocabulary for shape-conditioned autoregressive mesh generation. Our extensive experiments show that our method generates AMs with hundreds of times fewer faces, significantly improving storage, rendering, and simulation efficiencies, while achieving precision comparable to previous methods.

Related papers

MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs [79.45006864728893]
MeshCraft is a framework for efficient and controllable mesh generation. It uses continuous spatial diffusion to generate discrete triangle faces. It can generate an 800-face mesh in just 3.2 seconds.
arXiv Detail & Related papers (2025-03-29T09:21:50Z)
ConvMesh: Reimagining Mesh Quality Through Convex Optimization [55.2480439325792]
This research introduces a convex optimization programming called disciplined convex programming to enhance existing meshes. By focusing on a sparse set of point clouds from both the original and target meshes, this method demonstrates significant improvements in mesh quality with minimal data requirements.
arXiv Detail & Related papers (2024-12-11T15:48:25Z)
Structured 3D Latents for Scalable and Versatile 3D Generation [28.672494137267837]
We introduce a novel 3D generation method for versatile and high-quality 3D asset creation. The cornerstone is a unified Structured LATent representation which allows decoding to different output formats. This is achieved by integrating a sparsely-populated 3D grid with dense multiview visual features extracted from a powerful vision foundation model.
arXiv Detail & Related papers (2024-12-02T13:58:38Z)
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images [28.302030751098354]
StdGEN is an innovative pipeline for generating semantically high-quality 3D characters from single images. It generates intricately detailed 3D characters with separated semantic components such as the body, clothes, and hair, in three minutes. StdGEN offers ready-to-use semantic-decomposed 3D characters and enables flexible customization for a wide range of applications.
arXiv Detail & Related papers (2024-11-08T17:54:18Z)
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion [86.25111098482537]
We introduce 3DTopia-XL, a scalable native 3D generative model designed to overcome limitations of existing methods. 3DTopia-XL leverages a novel primitive-based 3D representation, PrimX, which encodes detailed shape, albedo, and material field into a compact tensorial format. On top of the novel representation, we propose a generative framework based on Diffusion Transformer (DiT), which comprises 1) Primitive Patch Compression, 2) and Latent Primitive Diffusion. We conduct extensive qualitative and quantitative experiments to demonstrate that 3DTopia-XL significantly outperforms existing methods in generating high-
arXiv Detail & Related papers (2024-09-19T17:59:06Z)
PASTA: Controllable Part-Aware Shape Generation with Autoregressive Transformers [5.7181794813117754]
PASTA is an autoregressive transformer architecture for generating high quality 3D shapes. Our model generates 3D shapes that are both more realistic and diverse than existing part-based and non part-based methods.
arXiv Detail & Related papers (2024-07-18T16:52:45Z)
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models [51.1972329762843]
We present a family of generative pre-trained auto-regressive models, which addresses the process of 3D mesh generation with modern large language model approaches. MeshXL is able to generate high-quality 3D meshes, and can also serve as foundation models for various down-stream applications.
arXiv Detail & Related papers (2024-05-31T14:35:35Z)
LAM3D: Large Image-Point-Cloud Alignment Model for 3D Reconstruction from Single Image [64.94932577552458]
Large Reconstruction Models have made significant strides in the realm of automated 3D content generation from single or multiple input images. Despite their success, these models often produce 3D meshes with geometric inaccuracies, stemming from the inherent challenges of deducing 3D shapes solely from image data. We introduce a novel framework, the Large Image and Point Cloud Alignment Model (LAM3D), which utilizes 3D point cloud data to enhance the fidelity of generated 3D meshes.
arXiv Detail & Related papers (2024-05-24T15:09:12Z)
CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner [34.78919665494048]
CraftsMan can generate high-fidelity 3D geometries with highly varied shapes, regular mesh topologies, and detailed surfaces. Our method achieves high efficacy in producing superior-quality 3D assets compared to existing methods.
arXiv Detail & Related papers (2024-05-23T18:30:12Z)
DiffTF++: 3D-aware Diffusion Transformer for Large-Vocabulary 3D Generation [53.20147419879056]
We introduce a diffusion-based feed-forward framework to address challenges with a single model. Building upon our 3D-aware Diffusion model with TransFormer, we propose a stronger version for 3D generation, i.e., DiffTF++. Experiments on ShapeNet and OmniObject3D convincingly demonstrate the effectiveness of our proposed modules.
arXiv Detail & Related papers (2024-05-13T17:59:51Z)
Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability [118.26563926533517]
Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space. We extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.
arXiv Detail & Related papers (2024-02-19T15:33:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.