Related papers: PointDreamer: Zero-shot 3D Textured Mesh Reconstruction from Colored Point Cloud

PointDreamer: Zero-shot 3D Textured Mesh Reconstruction from Colored Point Cloud

URL: http://arxiv.org/abs/2406.15811v3
Date: Tue, 12 Aug 2025 13:31:36 GMT
Title: PointDreamer: Zero-shot 3D Textured Mesh Reconstruction from Colored Point Cloud
Authors: Qiao Yu, Xianzhi Li, Yuan Tang, Xu Han, Jinfeng Xu, Long Hu, Min Chen,
Abstract summary: We propose PointDreamer, a novel framework that harnesses 2D diffusion prior for superior texture quality.<n>PointDreamer, though zero-shot, exhibits SoTA performance (30% improvement on LPIPS score from 0.118 to 0.068), and is robust to noisy, sparse, or even incomplete input data.
Score: 21.224318910715777
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Faithfully reconstructing textured meshes is crucial for many applications. Compared to text or image modalities, leveraging 3D colored point clouds as input (colored-PC-to-mesh) offers inherent advantages in comprehensively and precisely replicating the target object's 360{\deg} characteristics. While most existing colored-PC-to-mesh methods suffer from blurry textures or require hard-to-acquire 3D training data, we propose PointDreamer, a novel framework that harnesses 2D diffusion prior for superior texture quality. Crucially, unlike prior 2D-diffusion-for-3D works driven by text or image inputs, PointDreamer successfully adapts 2D diffusion models to 3D point cloud data by a novel project-inpaint-unproject pipeline. Specifically, it first projects the point cloud into sparse 2D images and then performs diffusion-based inpainting. After that, diverging from most existing 3D reconstruction or generation approaches that predict texture in 3D/UV space thus often yielding blurry texture, PointDreamer achieves high-quality texture by directly unprojecting the inpainted 2D images to the 3D mesh. Furthermore, we identify for the first time a typical kind of unprojection artifact appearing in occlusion borders, which is common in other multiview-image-to-3D pipelines but less-explored. To address this, we propose a novel solution named the Non-Border-First (NBF) unprojection strategy. Extensive qualitative and quantitative experiments on various synthetic and real-scanned datasets demonstrate that PointDreamer, though zero-shot, exhibits SoTA performance (30% improvement on LPIPS score from 0.118 to 0.068), and is robust to noisy, sparse, or even incomplete input data. Code at: https://github.com/YuQiao0303/PointDreamer.

Related papers

Category-Aware 3D Object Composition with Disentangled Texture and Shape Multi-view Diffusion [31.888133775976414]
We tackle a new task of 3D object synthesis, where a 3D model is composited with another object category to create a novel 3D model.<n>Most existing text/image/3D-to-3D methods struggle to effectively integrate multiple content sources.<n>We propose category+3D-to-3D (C33D), for generating novel and structurally coherent 3D models.
arXiv Detail & Related papers (2025-09-02T14:19:21Z)
TEXGen: a Generative Diffusion Model for Mesh Textures [63.43159148394021]
We focus on the fundamental problem of learning in the UV texture space itself. We propose a scalable network architecture that interleaves convolutions on UV maps with attention layers on point clouds. We train a 700 million parameter diffusion model that can generate UV texture maps guided by text prompts and single-view images.
arXiv Detail & Related papers (2024-11-22T05:22:11Z)
DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation [149.77077125310805]
We present DreamMesh, a novel text-to-3D architecture that pivots on well-defined surfaces (triangle meshes) to generate high-fidelity explicit 3D model. In the coarse stage, the mesh is first deformed by text-guided Jacobians and then DreamMesh textures the mesh with an interlaced use of 2D diffusion models. In the fine stage, DreamMesh jointly manipulates the mesh and refines the texture map, leading to high-quality triangle meshes with high-fidelity textured materials.
arXiv Detail & Related papers (2024-09-11T17:59:02Z)
GaussianPU: A Hybrid 2D-3D Upsampling Framework for Enhancing Color Point Clouds via 3D Gaussian Splatting [11.60605616190011]
We propose a novel 2D-3D hybrid colored point cloud upsampling framework (GaussianPU) based on 3D Gaussian Splatting (3DGS) for robotic perception. A dual scale rendered image restoration network transforms sparse point cloud renderings into dense representations. We have made a series of enhancements to the vanilla 3DGS, enabling precise control over the number of points.
arXiv Detail & Related papers (2024-09-03T03:35:04Z)
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data [50.164670363633704]
We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets from text prompts. Our model is directly trained on extensive noisy and unaligned in-the-wild' 3D assets. We achieve state-of-the-art performance in both single-class generation and text-to-3D generation.
arXiv Detail & Related papers (2024-06-06T17:58:15Z)
LAM3D: Large Image-Point-Cloud Alignment Model for 3D Reconstruction from Single Image [64.94932577552458]
Large Reconstruction Models have made significant strides in the realm of automated 3D content generation from single or multiple input images. Despite their success, these models often produce 3D meshes with geometric inaccuracies, stemming from the inherent challenges of deducing 3D shapes solely from image data. We introduce a novel framework, the Large Image and Point Cloud Alignment Model (LAM3D), which utilizes 3D point cloud data to enhance the fidelity of generated 3D meshes.
arXiv Detail & Related papers (2024-05-24T15:09:12Z)
Consistent Mesh Diffusion [8.318075237885857]
Given a 3D mesh with a UV parameterization, we introduce a novel approach to generating textures from text prompts. We demonstrate our approach on a dataset containing 30 meshes, taking approximately 5 minutes per mesh.
arXiv Detail & Related papers (2023-12-01T23:25:14Z)
LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes [52.31402192831474]
Existing 3D scene generation models, however, limit the target scene to specific domain. We propose LucidDreamer, a domain-free scene generation pipeline. LucidDreamer produces highly-detailed Gaussian splats with no constraint on domain of the target scene.
arXiv Detail & Related papers (2023-11-22T13:27:34Z)
Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors [104.79392615848109]
We present Magic123, a two-stage coarse-to-fine approach for high-quality, textured 3D meshes from a single unposed image. In the first stage, we optimize a neural radiance field to produce a coarse geometry. In the second stage, we adopt a memory-efficient differentiable mesh representation to yield a high-resolution mesh with a visually appealing texture.
arXiv Detail & Related papers (2023-06-30T17:59:08Z)
Point2Pix: Photo-Realistic Point Cloud Rendering via Neural Radiance Fields [63.21420081888606]
Recent Radiance Fields and extensions are proposed to synthesize realistic images from 2D input. We present Point2Pix as a novel point to link the 3D sparse point clouds with 2D dense image pixels.
arXiv Detail & Related papers (2023-03-29T06:26:55Z)
Deep Hybrid Self-Prior for Full 3D Mesh Generation [57.78562932397173]
We propose to exploit a novel hybrid 2D-3D self-prior in deep neural networks to significantly improve the geometry quality. In particular, we first generate an initial mesh using a 3D convolutional neural network with 3D self-prior, and then encode both 3D information and color information in the 2D UV atlas. Our method recovers the 3D textured mesh model of high quality from sparse input, and outperforms the state-of-the-art methods in terms of both the geometry and texture quality.
arXiv Detail & Related papers (2021-08-18T07:44:21Z)
SE-MD: A Single-encoder multiple-decoder deep network for point cloud generation from 2D images [2.4087148947930634]
3D model generation from single 2D RGB images is a challenging and actively researched computer vision task. There are various issues like using inefficient 3D representation formats, weak 3D model generation backbones, inability to generate dense point clouds. A novel 2D RGB image to point cloud conversion technique is proposed, which improves the state of art in the field.
arXiv Detail & Related papers (2021-06-17T10:48:46Z)
ParaNet: Deep Regular Representation for 3D Point Clouds [62.81379889095186]
ParaNet is a novel end-to-end deep learning framework for representing 3D point clouds. It converts an irregular 3D point cloud into a regular 2D color image, named point geometry image (PGI) In contrast to conventional regular representation modalities based on multi-view projection and voxelization, the proposed representation is differentiable and reversible.
arXiv Detail & Related papers (2020-12-05T13:19:55Z)
ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes [93.82668222075128]
We propose a 3D detection architecture called ImVoteNet for RGB-D scenes. ImVoteNet is based on fusing 2D votes in images and 3D votes in point clouds. We validate our model on the challenging SUN RGB-D dataset.
arXiv Detail & Related papers (2020-01-29T05:09:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.