AMB3R: Accurate Feed-forward Metric-scale 3D Reconstruction with Backend
- URL: http://arxiv.org/abs/2511.20343v1
- Date: Tue, 25 Nov 2025 14:23:04 GMT
- Title: AMB3R: Accurate Feed-forward Metric-scale 3D Reconstruction with Backend
- Authors: Hengyi Wang, Lourdes Agapito
- Abstract summary: AMB3R is a feed-forward model for dense 3D reconstruction at metric scale. We show that AMB3R can be seamlessly extended to uncalibrated visual odometry (online) or large-scale structure from motion.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present AMB3R, a multi-view feed-forward model for dense 3D reconstruction at metric scale that addresses diverse 3D vision tasks. The key idea is to leverage a sparse, yet compact, volumetric scene representation as our backend, enabling geometric reasoning with spatial compactness. Although trained solely for multi-view reconstruction, we demonstrate that AMB3R can be seamlessly extended to uncalibrated visual odometry (online) or large-scale structure from motion without the need for task-specific fine-tuning or test-time optimization. Compared to prior pointmap-based models, our approach achieves state-of-the-art performance in camera pose, depth, and metric-scale estimation, 3D reconstruction, and even surpasses optimization-based SLAM and SfM methods with dense reconstruction priors on common benchmarks.
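The abstract does not detail the "sparse, yet compact, volumetric scene representation", but the general idea of a sparse volumetric backend can be illustrated with a minimal voxel-hashing sketch. All names, the voxel size, and the centroid aggregation below are illustrative assumptions, not AMB3R's actual design:

```python
import numpy as np

def voxelize_sparse(points, voxel_size=0.05):
    """Group 3D points into occupied voxel cells only (no dense grid) and
    keep one centroid per cell -- a compact scene representation.
    Illustrative sketch; AMB3R's actual backend is not specified here."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    cells = {}
    for key, p in zip(map(tuple, keys), points):
        cells.setdefault(key, []).append(p)   # hash only occupied cells
    # one centroid per occupied cell
    return {k: np.mean(v, axis=0) for k, v in cells.items()}
```

The point of such a structure is that memory scales with occupied surface area rather than scene volume, which is what makes volumetric geometric reasoning tractable at scale.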
Related papers
- S-MUSt3R: Sliding Multi-view 3D Reconstruction [17.018626984951823]
This work proposes S-MUSt3R, a simple and efficient pipeline that extends the limits of foundation models for monocular 3D reconstruction. We show that S-MUSt3R runs successfully on long RGB sequences and produces accurate and consistent 3D reconstruction.
arXiv Detail & Related papers (2026-02-04T13:07:14Z)
- AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance [36.125573065910594]
Active 3D reconstruction enables an agent to autonomously select viewpoints to obtain accurate and complete scene geometry. We propose AREA3D, an active reconstruction agent that leverages feed-forward 3D reconstruction models and vision-language guidance. Our framework decouples view-uncertainty modeling from the underlying feed-forward reconstructor, enabling precise uncertainty estimation without expensive online optimization.
arXiv Detail & Related papers (2025-11-28T06:17:02Z)
- LARM: A Large Articulated-Object Reconstruction Model [29.66486888001511]
LARM is a unified feedforward framework that reconstructs 3D articulated objects from sparse-view images. LARM generates auxiliary outputs such as depth maps and part masks to facilitate explicit 3D mesh extraction and joint estimation. Our pipeline eliminates the need for dense supervision and supports high-fidelity reconstruction across diverse object categories.
arXiv Detail & Related papers (2025-11-14T18:55:27Z)
- MoRE: 3D Visual Geometry Reconstruction Meets Mixture-of-Experts [50.37005070020306]
MoRE is a dense 3D visual foundation model based on a Mixture-of-Experts (MoE) architecture. MoRE incorporates a confidence-based depth refinement module that stabilizes and refines geometric estimation. It integrates dense semantic features with globally aligned 3D backbone representations for high-fidelity surface normal prediction.
arXiv Detail & Related papers (2025-10-31T06:54:27Z)
- MapAnything: Universal Feed-Forward Metric 3D Reconstruction [63.79151976126576]
MapAnything ingests one or more images along with optional geometric inputs such as camera intrinsics, poses, depth, or partial reconstructions. It then directly regresses the metric 3D scene geometry and cameras. MapAnything addresses a broad range of 3D vision tasks in a single feed-forward pass.
arXiv Detail & Related papers (2025-09-16T18:00:14Z)
- 3D Reconstruction via Incremental Structure From Motion [1.4999444543328293]
We present a detailed implementation of the incremental SfM pipeline, focusing on the consistency of geometric estimation and the effect of iterative refinement through bundle adjustment. Results support the practical utility of incremental SfM as a reliable method for sparse 3D reconstruction in visually structured environments.
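The iterative refinement mentioned here minimizes reprojection error. A toy Gauss-Newton loop on that residual, refining only a single 3D point with fixed, known cameras (full bundle adjustment jointly updates all points and the camera poses), might look like the following sketch; the function and variable names are illustrative:

```python
import numpy as np

def refine_point(X0, poses, K, obs, iters=10):
    """Gauss-Newton on the reprojection residual for one 3D point, given
    known camera poses (R, t), intrinsics K, and 2D observations obs.
    Bundle adjustment minimizes the same residual over all unknowns."""
    X = X0.astype(np.float64).copy()
    for _ in range(iters):
        J_rows, r_rows = [], []
        for (R, t), uv in zip(poses, obs):
            u = K @ (R @ X + t)                    # homogeneous pixel coords
            r_rows.append(u[:2] / u[2] - uv)       # reprojection residual
            d_proj = np.array([[1 / u[2], 0, -u[0] / u[2] ** 2],
                               [0, 1 / u[2], -u[1] / u[2] ** 2]])
            J_rows.append(d_proj @ K @ R)          # chain rule: d(residual)/dX
        J = np.vstack(J_rows)
        r = np.concatenate(r_rows)
        X -= np.linalg.solve(J.T @ J, J.T @ r)     # normal-equations update
    return X
```

With two or more views giving parallax, the 3x3 normal equations are well conditioned and the update converges in a handful of iterations.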
arXiv Detail & Related papers (2025-08-01T18:45:05Z)
- GTR: Gaussian Splatting Tracking and Reconstruction of Unknown Objects Based on Appearance and Geometric Complexity [49.31257173003408]
We present a novel method for 6-DoF object tracking and high-quality 3D reconstruction from monocular RGBD video. Our approach demonstrates strong capabilities in recovering high-fidelity object meshes, setting a new standard for single-sensor 3D reconstruction in open-world environments.
arXiv Detail & Related papers (2025-05-17T08:46:29Z)
- MUSt3R: Multi-view Network for Stereo 3D Reconstruction [11.61182864709518]
We propose an extension of DUSt3R from pairs to multiple views that addresses all aforementioned concerns. We equip the model with a multi-layer memory mechanism that reduces the computational complexity. The framework is designed to perform 3D reconstruction both offline and online, and hence can be seamlessly applied to SfM and visual SLAM scenarios.
arXiv Detail & Related papers (2025-03-03T15:36:07Z)
- FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views [100.45129752375658]
We present FLARE, a feed-forward model designed to infer high-quality camera poses and 3D geometry from uncalibrated sparse-view images. Our solution features a cascaded learning paradigm with camera pose serving as the critical bridge, recognizing its essential role in mapping 3D structures onto 2D image planes.
arXiv Detail & Related papers (2025-02-17T18:54:05Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- Towards Scalable Multi-View Reconstruction of Geometry and Materials [27.660389147094715]
We propose a novel method for joint recovery of camera pose, object geometry and spatially-varying Bidirectional Reflectance Distribution Function (svBRDF) of 3D scenes.
The inputs are high-resolution RGBD images captured by a mobile, hand-held capture system with point lights for active illumination.
arXiv Detail & Related papers (2023-06-06T15:07:39Z)
- Towards 3D Scene Reconstruction from Locally Scale-Aligned Monocular Video Depth [90.33296913575818]
In some video-based scenarios, such as video depth estimation and 3D scene reconstruction from a video, the unknown scale and shift residing in per-frame predictions may cause depth inconsistency.
We propose a locally weighted linear regression method to recover the scale and shift with very sparse anchor points.
Our method can boost the performance of existing state-of-the-art approaches by up to 50% on several zero-shot benchmarks.
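The locally weighted linear regression described above can be sketched as follows; the Gaussian spatial weighting, the bandwidth `sigma`, and all names are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def recover_scale_shift(pred, anchors_xy, anchors_depth, sigma=32.0):
    """For each pixel, fit depth = a * pred + b by least squares over sparse
    anchor points, weighted by spatial distance to that pixel, then apply the
    local (a, b). anchors_xy holds (x, y) pixel coordinates of the anchors."""
    p = pred[anchors_xy[:, 1], anchors_xy[:, 0]]   # predicted depth at anchors
    A = np.stack([p, np.ones_like(p)], axis=1)     # design matrix [pred, 1]
    H, W = pred.shape
    aligned = np.empty((H, W))
    for y in range(H):
        for x in range(W):
            d2 = (anchors_xy[:, 0] - x) ** 2 + (anchors_xy[:, 1] - y) ** 2
            w = np.exp(-d2 / (2.0 * sigma ** 2))   # Gaussian spatial weights
            Aw = A * w[:, None]
            # weighted normal equations for local scale a and shift b
            a, b = np.linalg.solve(A.T @ Aw, Aw.T @ anchors_depth)
            aligned[y, x] = a * pred[y, x] + b
    return aligned
```

Because the fit is local, the scale and shift can vary smoothly across the image rather than being a single global correction.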
arXiv Detail & Related papers (2022-02-03T08:52:54Z)
- Learning to Detect 3D Reflection Symmetry for Single-View Reconstruction [32.14605731030579]
3D reconstruction from a single RGB image is a challenging problem in computer vision.
Previous methods are usually purely data-driven, which leads to inaccurate 3D shape recovery and limited generalization capability.
We present a geometry-based end-to-end deep learning framework that first detects the mirror plane of the reflection symmetry commonly present in man-made objects, and then predicts depth maps by finding the intra-image pixel-wise correspondences induced by that symmetry.
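The mirroring step at the heart of such symmetry reasoning is plain point reflection across the detected plane; a sketch of the geometry only (not the paper's network) could read:

```python
import numpy as np

def reflect_across_plane(points, normal, d):
    """Reflect Nx3 points across the plane normal . x + d = 0.
    A symmetric object maps onto itself under this reflection, which is what
    lets intra-image correspondences constrain depth."""
    n = normal / np.linalg.norm(normal)     # ensure a unit normal
    signed = points @ n + d                 # signed distance to the plane
    return points - 2.0 * signed[:, None] * n
```

Applying the reflection twice returns the original points, a quick sanity check for any detected mirror plane.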
arXiv Detail & Related papers (2020-06-17T17:58:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.