GeoFusionLRM: Geometry-Aware Self-Correction for Consistent 3D Reconstruction
- URL: http://arxiv.org/abs/2602.14119v1
- Date: Sun, 15 Feb 2026 12:39:04 GMT
- Title: GeoFusionLRM: Geometry-Aware Self-Correction for Consistent 3D Reconstruction
- Authors: Ahmet Burak Yildirim, Tuna Saygin, Duygu Ceylan, Aysegul Dundar
- Abstract summary: Single-image 3D reconstruction with large reconstruction models (LRMs) has advanced rapidly, yet reconstructions often exhibit geometric inconsistencies and misaligned details that limit fidelity. We introduce GeoFusionLRM, a geometry-aware self-correction framework that leverages the model's own normal and depth predictions to refine structural accuracy.
- Score: 27.169882738788257
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Single-image 3D reconstruction with large reconstruction models (LRMs) has advanced rapidly, yet reconstructions often exhibit geometric inconsistencies and misaligned details that limit fidelity. We introduce GeoFusionLRM, a geometry-aware self-correction framework that leverages the model's own normal and depth predictions to refine structural accuracy. Unlike prior approaches that rely solely on features extracted from the input image, GeoFusionLRM feeds back geometric cues through a dedicated transformer and fusion module, enabling the model to correct errors and enforce consistency with the conditioning image. This design improves the alignment between the reconstructed mesh and the input views without additional supervision or external signals. Extensive experiments demonstrate that GeoFusionLRM achieves sharper geometry, more consistent normals, and higher fidelity than state-of-the-art LRM baselines.
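The feedback loop the abstract describes (predict geometry, feed it back, refine) can be illustrated with a minimal sketch. This is a hypothetical simplification, not the authors' code: plain linear maps stand in for the paper's dedicated transformer and fusion module, and all names (`predict_geometry`, `self_correct`, `w_fuse`, etc.) are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

n_tokens, dim = 16, 8                                     # toy token grid
w_depth = rng.standard_normal((dim, 1))                   # stand-in depth head
w_normal = rng.standard_normal((dim, 3))                  # stand-in normal head
w_fuse = 0.01 * rng.standard_normal((dim + 1 + 3, dim))   # fusion projection

def predict_geometry(tokens):
    """Derive per-token depth and unit-normal cues from the current
    reconstruction tokens (linear heads stand in for real decoders)."""
    depth = tokens @ w_depth
    normals = tokens @ w_normal
    normals = normals / (np.linalg.norm(normals, axis=1, keepdims=True) + 1e-8)
    return depth, normals

def self_correct(tokens, steps=3):
    """Geometry-aware self-correction loop: feed the model's own
    depth/normal predictions back into the representation and apply a
    residual update (a placeholder for the transformer + fusion module)."""
    for _ in range(steps):
        depth, normals = predict_geometry(tokens)
        cues = np.concatenate([tokens, depth, normals], axis=1)
        tokens = tokens + cues @ w_fuse   # residual geometric correction
    return tokens

tokens = rng.standard_normal((n_tokens, dim))
refined = self_correct(tokens)
print(refined.shape)   # (16, 8)
```

The key structural point the sketch captures is that the geometric cues come from the model's own predictions rather than from external supervision, so refinement requires no additional signals at inference time.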
Related papers
- KaoLRM: Repurposing Pre-trained Large Reconstruction Models for Parametric 3D Face Reconstruction [51.67605823241639]
KaoLRM re-targets the learned prior of the Large Reconstruction Model (LRM) for parametric 3D face reconstruction from single-view images. Experiments on both controlled and in-the-wild benchmarks demonstrate that KaoLRM achieves superior reconstruction accuracy and cross-view consistency.
arXiv Detail & Related papers (2026-01-19T05:36:59Z) - Joint Geometry-Appearance Human Reconstruction in a Unified Latent Space via Bridge Diffusion [57.09673862519791]
This paper introduces JGA-LBD, a novel framework that unifies the modeling of geometry and appearance into a joint latent representation. Experiments demonstrate that JGA-LBD outperforms current state-of-the-art approaches in terms of both geometry fidelity and appearance quality.
arXiv Detail & Related papers (2026-01-01T12:48:56Z) - GACO-CAD: Geometry-Augmented and Conciseness-Optimized CAD Model Generation from Single Image [11.612167656421079]
Multi-modal large language models (MLLMs) still struggle with accurately inferring 3D geometry from 2D images. We introduce GACO-CAD, a novel two-stage post-training framework. Experiments on the DeepCAD and Fusion360 datasets show that GACO-CAD achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-10-20T04:57:20Z) - A 3DGS-Diffusion Self-Supervised Framework for Normal Estimation from a Single Image [5.588610465556571]
The lack of spatial dimensional information remains a challenge in normal estimation from a single image. Recent diffusion-based methods have demonstrated significant potential in 2D-to-3D implicit mapping. This paper proposes SINGAD, a novel self-supervised framework for normal estimation from a single image.
arXiv Detail & Related papers (2025-08-08T02:32:33Z) - GeoAda: Efficiently Finetune Geometric Diffusion Models with Equivariant Adapters [61.51810815162003]
We propose an SE(3)-equivariant adapter framework (GeoAda) that enables flexible and parameter-efficient fine-tuning for controlled generative tasks. GeoAda preserves the model's geometric consistency while mitigating overfitting and catastrophic forgetting. We demonstrate the wide applicability of GeoAda across diverse geometric control types, including frame control, global control, subgraph control, and a broad range of application domains.
arXiv Detail & Related papers (2025-07-02T18:44:03Z) - Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization [27.509109317973817]
3D Gaussian Splatting (3DGS) has garnered significant attention for its high-quality rendering and fast inference speed. Previous methods primarily focus on geometry regularization, with common approaches including primitive-based and dual-model frameworks. We propose CarGS, a unified model leveraging contribution-adaptive regularization to achieve simultaneous high-quality rendering and surface reconstruction.
arXiv Detail & Related papers (2025-03-02T12:51:38Z) - GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z) - MeshLRM: Large Reconstruction Model for High-Quality Meshes [52.71164862539288]
MeshLRM can reconstruct a high-quality mesh from merely four input images in less than one second. Our approach achieves state-of-the-art mesh reconstruction from sparse-view inputs and also allows for many downstream applications.
arXiv Detail & Related papers (2024-04-18T17:59:41Z) - GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images.
We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z) - CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model [37.75256020559125]
We present a high-fidelity feed-forward single image-to-3D generative model.
We highlight the necessity of integrating geometric priors into network design.
Our model delivers a high-fidelity textured mesh from an image in just 10 seconds, without any test-time optimization.
arXiv Detail & Related papers (2024-03-08T04:25:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.