Related papers: Rethinking Multimodal Point Cloud Completion: A Completion-by-Correction Perspective

URL: http://arxiv.org/abs/2511.12170v1
Date: Sat, 15 Nov 2025 11:51:13 GMT
Title: Rethinking Multimodal Point Cloud Completion: A Completion-by-Correction Perspective
Authors: Wang Luo, Di Wu, Hengyuan Na, Yinlin Zhu, Miao Hu, Guocong Quan,
Abstract summary: Point cloud completion aims to reconstruct complete 3D shapes from partial observations. Most methods still follow a Completion-by-Inpainting paradigm. We propose Completion-by-Correction, which begins with a complete shape prior and performs feature-space correction to align it with the partial observation.
Score: 8.276620253870338
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Point cloud completion aims to reconstruct complete 3D shapes from partial observations, which is a challenging problem due to severe occlusions and missing geometry. Despite recent advances in multimodal techniques that leverage complementary RGB images to compensate for missing geometry, most methods still follow a Completion-by-Inpainting paradigm, synthesizing missing structures from fused latent features. We empirically show that this paradigm often results in structural inconsistencies and topological artifacts due to limited geometric and semantic constraints. To address this, we rethink the task and propose a more robust paradigm, termed Completion-by-Correction, which begins with a topologically complete shape prior generated by a pretrained image-to-3D model and performs feature-space correction to align it with the partial observation. This paradigm shifts completion from unconstrained synthesis to guided refinement, enabling structurally consistent and observation-aligned reconstruction. Building upon this paradigm, we introduce PGNet, a multi-stage framework that conducts dual-feature encoding to ground the generative prior, synthesizes a coarse yet structurally aligned scaffold, and progressively refines geometric details via hierarchical correction. Experiments on the ShapeNetViPC dataset demonstrate the superiority of PGNet over state-of-the-art baselines in terms of average Chamfer Distance (-23.5%) and F-score (+7.1%).
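The two metrics quoted in the abstract, Chamfer Distance and F-score, can be illustrated with a minimal NumPy sketch. This is not the paper's evaluation code: the squared-L2 Chamfer variant, the absence of any scaling factor, and the `threshold` parameter are assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between every point in a (N, 3) and b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff * diff, axis=-1)


def chamfer_distance(pred, gt):
    """Symmetric Chamfer Distance (squared-L2 variant, an assumption here).

    Averages each point's squared distance to its nearest neighbour in the
    other cloud, then sums the two directions.
    """
    d = pairwise_sq_dists(pred, gt)
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())


def f_score(pred, gt, threshold):
    """F-score at a distance threshold (the threshold value is an assumption).

    Precision: fraction of predicted points within `threshold` of the ground truth.
    Recall: fraction of ground-truth points within `threshold` of the prediction.
    """
    d = np.sqrt(pairwise_sq_dists(pred, gt))
    precision = float(np.mean(d.min(axis=1) < threshold))
    recall = float(np.mean(d.min(axis=0) < threshold))
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

A lower Chamfer Distance and a higher F-score both indicate a completion closer to the ground truth, which is why the abstract reports a negative change for the former and a positive one for the latter. The brute-force pairwise matrix is fine for a few thousand points; larger clouds would normally use a spatial index instead.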

Related papers

Inverse Rendering for High-Genus 3D Surface Meshes from Multi-view Images with Persistent Homology Priors [11.227213428407673]
Reconstructing 3D objects from images is inherently an ill-posed problem due to ambiguities in geometry, appearance, and topology. This paper introduces collaborative rendering with persistent homology priors, a novel strategy that leverages topological constraints to resolve these ambiguities.
arXiv Detail & Related papers (2026-01-17T20:06:19Z)
Joint Geometry-Appearance Human Reconstruction in a Unified Latent Space via Bridge Diffusion [57.09673862519791]
This paper introduces JGA-LBD, a novel framework that unifies the modeling of geometry and appearance into a joint latent representation. Experiments demonstrate that JGA-LBD outperforms current state-of-the-art approaches in terms of both geometry fidelity and appearance quality.
arXiv Detail & Related papers (2026-01-01T12:48:56Z)
ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing Points [32.23473666846317]
We propose ControlVP, a user-guided framework for correcting vanishing point inconsistencies in generated images. Our approach extends a pre-trained diffusion model by incorporating structural guidance derived from building contours. Our method enhances global geometric consistency while maintaining visual fidelity comparable to the baselines.
arXiv Detail & Related papers (2025-12-08T12:38:11Z)
TALO: Pushing 3D Vision Foundation Models Towards Globally Consistent Online Reconstruction [57.46712611558817]
3D vision foundation models have shown strong generalization in reconstructing key 3D attributes from uncalibrated images through a single feed-forward pass. Recent strategies align consecutive predictions by solving global transformation, yet our analysis reveals their fundamental limitations in assumption validity, local alignment scope, and robustness under noisy geometry. We propose a higher-DOF and long-term alignment framework based on Thin Plate Spline, leveraging globally propagated control points to correct spatially varying inconsistencies.
arXiv Detail & Related papers (2025-12-02T02:22:20Z)
KineDiff3D: Kinematic-Aware Diffusion for Category-Level Articulated Object Shape Reconstruction and Generation [28.822034731724013]
Articulated objects, such as laptops and drawers, exhibit significant challenges for 3D reconstruction and pose estimation. We propose KineDiff3D: Kinematic-Aware Diffusion for Category-Level Articulated Object Shape Reconstruction and Generation.
arXiv Detail & Related papers (2025-10-20T04:15:40Z)
G4Splat: Geometry-Guided Gaussian Splatting with Generative Prior [53.762256749551284]
We identify accurate geometry as the fundamental prerequisite for effectively exploiting generative models to enhance 3D scene reconstruction. We incorporate this geometry guidance throughout the generative pipeline to improve visibility mask estimation, guide novel view selection, and enhance multi-view consistency when inpainting with video diffusion models. Our method naturally supports single-view inputs and unposed videos, with strong generalizability in both indoor and outdoor scenarios.
arXiv Detail & Related papers (2025-10-14T03:06:28Z)
OracleGS: Grounding Generative Priors for Sparse-View Gaussian Splatting [78.70702961852119]
OracleGS reconciles generative completeness with regressive fidelity for sparse view Gaussian Splatting. Our approach conditions the powerful generative prior on multi-view geometric evidence, filtering hallucinatory artifacts while preserving plausible completions in under-constrained regions.
arXiv Detail & Related papers (2025-09-27T11:19:32Z)
CAGE: Continuity-Aware edGE Network Unlocks Robust Floorplan Reconstruction [24.09888364478496]
We present CAGE, a robust framework for reconstructing vector floorplans directly from point-cloud density maps. CAGE achieves state-of-the-art performance, with F1 scores of 99.1% (rooms), 91.7% (corners), and 89.3% (angles).
arXiv Detail & Related papers (2025-09-18T22:10:37Z)
Seeing 3D Through 2D Lenses: 3D Few-Shot Class-Incremental Learning via Cross-Modal Geometric Rectification [59.17489431187807]
We propose a framework that enhances 3D geometric fidelity by leveraging CLIP's hierarchical spatial semantics. Our method significantly improves 3D few-shot class-incremental learning, achieving superior geometric coherence and robustness to texture bias.
arXiv Detail & Related papers (2025-09-18T13:45:08Z)
Matching Shapes Under Different Topologies: A Topology-Adaptive Deformation Guided Approach [6.851721795186258]
Non-rigid 3D mesh matching is a critical step in computer vision and computer graphics pipelines. We tackle matching meshes that contain topological artefacts which can break the assumption made by current approaches. We are motivated by real-world scenarios such as per-frame multi-view reconstructions, often suffering from topological artefacts. We show that, while not relying on any data-driven prior, our approach applies to highly non-isometric shapes and shapes with topological artefacts, including noisy per-frame multi-view reconstructions, even outperforming methods trained on large datasets in 3D alignment quality.
arXiv Detail & Related papers (2025-09-08T16:29:44Z)
Adaptive Point-Prompt Tuning: Fine-Tuning Heterogeneous Foundation Models for 3D Point Cloud Analysis [51.37795317716487]
We propose the Adaptive Point-Prompt Tuning (APPT) method, which fine-tunes pre-trained models with a modest number of parameters. We convert raw point clouds into point embeddings by aggregating local geometry to capture spatial features followed by linear layers. To calibrate self-attention across source domains of any modality to 3D, we introduce a prompt generator that shares weights with the point embedding module.
arXiv Detail & Related papers (2025-08-30T06:02:21Z)
Dens3R: A Foundation Model for 3D Geometry Prediction [44.13431776180547]
Dens3R is a 3D foundation model designed for joint geometric dense prediction. By integrating image-pair matching features with intrinsic invariance modeling, Dens3R accurately regresses multiple geometric quantities.
arXiv Detail & Related papers (2025-07-22T07:22:30Z)
SCORP: Scene-Consistent Object Refinement via Proxy Generation and Tuning [46.441761732998536]
We introduce Scene-Consistent Object Refinement via Proxy Generation and Tuning (SCORP). SCORP is a novel 3D enhancement framework that leverages 3D generative priors to recover fine-grained object geometry and appearance under missing views. It achieves consistent gains over recent state-of-the-art baselines on both novel view synthesis and geometry completion tasks.
arXiv Detail & Related papers (2025-06-30T13:26:21Z)
Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology. Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
