Related papers: BridgeShape: Latent Diffusion Schrödinger Bridge for 3D Shape Completion

URL: http://arxiv.org/abs/2506.23205v1
Date: Sun, 29 Jun 2025 12:21:21 GMT
Title: BridgeShape: Latent Diffusion Schrödinger Bridge for 3D Shape Completion
Authors: Dequan Kong, Zhe Zhu, Honghua Chen, Mingqiang Wei
Abstract summary: BridgeShape is a novel framework for 3D shape completion via latent diffusion Schrödinger bridge. We introduce a Depth-Enhanced Vector Quantized Variational Autoencoder (VQ-VAE) to encode 3D shapes into a compact latent space. BridgeShape achieves state-of-the-art performance on large-scale 3D shape completion benchmarks.
Score: 20.704173763035488
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Existing diffusion-based 3D shape completion methods typically use a conditional paradigm, injecting incomplete shape information into the denoising network via deep feature interactions (e.g., concatenation, cross-attention) to guide sampling toward complete shapes, often represented by voxel-based distance functions. However, these approaches fail to explicitly model the optimal global transport path, leading to suboptimal completions. Moreover, performing diffusion directly in voxel space imposes resolution constraints, limiting the generation of fine-grained geometric details. To address these challenges, we propose BridgeShape, a novel framework for 3D shape completion via latent diffusion Schrödinger bridge. The key innovations lie in two aspects: (i) BridgeShape formulates shape completion as an optimal transport problem, explicitly modeling the transition between incomplete and complete shapes to ensure a globally coherent transformation. (ii) We introduce a Depth-Enhanced Vector Quantized Variational Autoencoder (VQ-VAE) to encode 3D shapes into a compact latent space, leveraging self-projected multi-view depth information enriched with strong DINOv2 features to enhance geometric structural perception. By operating in a compact yet structurally informative latent space, BridgeShape effectively mitigates resolution constraints and enables more efficient and high-fidelity 3D shape completion. BridgeShape achieves state-of-the-art performance on large-scale 3D shape completion benchmarks, demonstrating superior fidelity at higher resolutions and for unseen object classes.

Related papers

Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling [34.238349310770886]
We introduce Sparc3D, a unified framework that combines a sparse deformable marching cubes representation, Sparcubes, with a novel encoder, Sparconv-VAE. Sparc3D achieves state-of-the-art reconstruction fidelity on challenging inputs, including open surfaces, disconnected components, and intricate geometry.
arXiv Detail & Related papers (2025-05-20T15:44:54Z)
SparseFlex: High-Resolution and Arbitrary-Topology 3D Shape Modeling [79.56581753856452]
SparseFlex is a novel sparse-structured isosurface representation that enables differentiable mesh reconstruction at resolutions up to $1024^3$ directly from rendering losses. By enabling high-resolution, differentiable mesh reconstruction and generation with rendering losses, SparseFlex significantly advances the state-of-the-art in 3D shape representation and modeling.
arXiv Detail & Related papers (2025-03-27T17:46:42Z)
Hyper3D: Efficient 3D Representation via Hybrid Triplane and Octree Feature for Enhanced 3D Shape Variational Auto-Encoders [43.61384205458698]
3D content generation pipelines often leverage Variational Autoencoders (VAEs) to encode shapes into compact latent representations. We introduce Hyper3D, which enhances VAE reconstruction through efficient 3D representation that integrates hybrid triplane and octree features. Experimental results demonstrate that Hyper3D outperforms traditional representations by reconstructing 3D shapes with higher fidelity and finer details.
arXiv Detail & Related papers (2025-03-13T14:26:43Z)
NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation [52.772319840580074]
3D shape generation aims to produce innovative 3D content adhering to specific conditions and constraints. Existing methods often decompose 3D shapes into a sequence of localized components, treating each element in isolation. We introduce a novel spatial-aware 3D shape generation framework that leverages 2D plane representations for enhanced 3D shape modeling.
arXiv Detail & Related papers (2024-03-27T04:09:34Z)
SC-Diff: 3D Shape Completion with Latent Diffusion Models [4.913210912019975]
This paper introduces a 3D shape completion approach using a 3D latent diffusion model optimized for completing shapes. Our method combines image-based conditioning through cross-attention and spatial conditioning through the integration of 3D features from captured partial scans.
arXiv Detail & Related papers (2024-03-19T06:01:11Z)
Robust 3D Tracking with Quality-Aware Shape Completion [67.9748164949519]
We propose a synthetic target representation composed of dense and complete point clouds depicting the target shape precisely by shape completion for robust 3D tracking. Specifically, we design a voxelized 3D tracking framework with shape completion, in which we propose a quality-aware shape completion mechanism to alleviate the adverse effect of noisy historical predictions.
arXiv Detail & Related papers (2023-12-17T04:50:24Z)
Surf-D: Generating High-Quality Surfaces of Arbitrary Topologies Using Diffusion Models [83.35835521670955]
Surf-D is a novel method for generating high-quality 3D shapes as Surfaces with arbitrary topologies. We use the Unsigned Distance Field (UDF) as our surface representation to accommodate arbitrary topologies. We also propose a new pipeline that employs a point-based AutoEncoder to learn a compact and continuous latent space for accurately encoding UDF.
arXiv Detail & Related papers (2023-11-28T18:56:01Z)
DeFormer: Integrating Transformers with Deformable Models for 3D Shape Abstraction from a Single Image [31.154786931081087]
We propose a novel bi-channel Transformer architecture, integrated with parameterized deformable models, to simultaneously estimate the global and local deformations of primitives. DeFormer achieves better reconstruction accuracy over the state-of-the-art, and visualizes with consistent semantic correspondences for improved interpretability.
arXiv Detail & Related papers (2023-09-22T02:46:43Z)
DiffComplete: Diffusion-based Generative 3D Shape Completion [114.43353365917015]
We introduce a new diffusion-based approach for shape completion on 3D range scans. We strike a balance between realism, multi-modality, and high fidelity. DiffComplete sets a new SOTA performance on two large-scale 3D shape completion benchmarks.
arXiv Detail & Related papers (2023-06-28T16:07:36Z)
3DQD: Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process [32.3773514247982]
We develop a generalized 3D shape generation prior model tailored for multiple 3D tasks. Designs jointly equip our proposed 3D shape prior model with high-fidelity, diverse features as well as the capability of cross-modality alignment.
arXiv Detail & Related papers (2023-03-18T12:50:29Z)
Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis [90.26556260531707]
DMTet is a conditional generative model that can synthesize high-resolution 3D shapes using simple user guides such as coarse voxels. Unlike deep 3D generative models that directly generate explicit representations such as meshes, our model can synthesize shapes with arbitrary topology.
arXiv Detail & Related papers (2021-11-08T05:29:35Z)
Deep Implicit Templates for 3D Shape Representation [70.9789507686618]
We propose a new 3D shape representation that supports explicit correspondence reasoning in deep implicit representations. Our key idea is to formulate DIFs as conditional deformations of a template implicit function. We show that our method can not only learn a common implicit template for a collection of shapes, but also establish dense correspondences across all the shapes simultaneously without any supervision.
arXiv Detail & Related papers (2020-11-30T06:01:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.