Test3R: Learning to Reconstruct 3D at Test Time
- URL: http://arxiv.org/abs/2506.13750v1
- Date: Mon, 16 Jun 2025 17:56:22 GMT
- Title: Test3R: Learning to Reconstruct 3D at Test Time
- Authors: Yuheng Yuan, Qiuhong Shen, Shizun Wang, Xingyi Yang, Xinchao Wang
- Abstract summary: Test3R is a surprisingly simple test-time learning technique that significantly boosts geometric accuracy. Our technique significantly outperforms previous state-of-the-art methods on the 3D reconstruction and multi-view depth estimation tasks.
- Score: 58.0912500917036
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dense matching methods like DUSt3R regress pairwise pointmaps for 3D reconstruction. However, the reliance on pairwise prediction and the limited generalization capability inherently restrict the global geometric consistency. In this work, we introduce Test3R, a surprisingly simple test-time learning technique that significantly boosts geometric accuracy. Using image triplets ($I_1,I_2,I_3$), Test3R generates reconstructions from pairs ($I_1,I_2$) and ($I_1,I_3$). The core idea is to optimize the network at test time via a self-supervised objective: maximizing the geometric consistency between these two reconstructions relative to the common image $I_1$. This ensures the model produces cross-pair consistent outputs, regardless of the inputs. Extensive experiments demonstrate that our technique significantly outperforms previous state-of-the-art methods on the 3D reconstruction and multi-view depth estimation tasks. Moreover, it is universally applicable and nearly cost-free, making it easily applied to other models and implemented with minimal test-time training overhead and parameter footprint. Code is available at https://github.com/nopQAQ/Test3R.
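The self-supervised objective above can be sketched in a toy form. This is a hypothetical illustration, not the authors' implementation: in the paper, the consistency objective is minimized over parameters of a DUSt3R-style network, whereas here the two pointmap predictions for the common image $I_1$ are synthetic arrays and the test-time parameter is reduced to a single scalar correction `c` that aligns them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy pointmaps of the shared image I1, as predicted from pairs (I1, I2) and
# (I1, I3). In Test3R these would come from a pairwise model such as DUSt3R;
# here they are synthetic arrays that disagree by an offset plus noise.
P12 = rng.normal(size=(4, 4, 3))
P13 = P12 + 0.5 + 0.05 * rng.normal(size=(4, 4, 3))

def consistency_loss(c):
    # Self-supervised objective: geometric consistency between the two
    # reconstructions of I1, after applying the test-time correction c.
    return float(np.mean((P12 - (P13 + c)) ** 2))

def test_time_adapt(c=0.0, lr=0.5, steps=100):
    # Plain gradient descent on the consistency objective.
    # d/dc mean((P12 - P13 - c)^2) = -2 * mean(P12 - P13 - c).
    for _ in range(steps):
        grad = -2.0 * float(np.mean(P12 - P13 - c))
        c -= lr * grad
    return c

c_star = test_time_adapt()
```

In this convex scalar toy, gradient descent recovers the systematic offset between the two predictions and the consistency loss drops sharply. In the actual method the backbone stays largely frozen and only a small set of parameters is tuned per scene, which is why minimizing consistency alone does not collapse to a trivial solution.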
Related papers
- Dens3R: A Foundation Model for 3D Geometry Prediction [44.13431776180547]
Dens3R is a 3D foundation model designed for joint geometric dense prediction. By integrating image-pair matching features with intrinsic invariance modeling, Dens3R accurately regresses multiple geometric quantities.
arXiv Detail & Related papers (2025-07-22T07:22:30Z)
- 3D Face Reconstruction Error Decomposed: A Modular Benchmark for Fair and Fast Method Evaluation [30.625439879741847]
We present a toolkit for a Modularized 3D Face reconstruction Benchmark (M3DFB). The fundamental components of error are segregated and interchangeable, allowing one to quantify the effect of each. We propose a new component, namely correction, and present a computationally efficient approach that penalizes mesh topology inconsistency.
arXiv Detail & Related papers (2025-05-23T15:28:43Z)
- SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images [49.7344030427291]
We study the problem of single-image 3D object reconstruction. Recent works have diverged into two directions: regression-based modeling and generative modeling. We present SPAR3D, a novel two-stage approach aiming to take the best of both directions.
arXiv Detail & Related papers (2025-01-08T18:52:03Z)
- Implicit Shape and Appearance Priors for Few-Shot Full Head Reconstruction [16.082960967820036]
In this paper, we address the problem of few-shot full 3D head reconstruction. We accomplish this by incorporating a probabilistic shape and appearance prior into coordinate-based representations. We extend the H3DS dataset, which now comprises 60 high-resolution 3D full head scans and their corresponding posed images and masks.
arXiv Detail & Related papers (2023-10-12T07:35:30Z)
- Neural Refinement for Absolute Pose Regression with Feature Synthesis [33.2608395824548]
Absolute Pose Regression (APR) methods use deep neural networks to directly regress camera poses from RGB images.
In this work, we propose a test-time refinement pipeline that leverages implicit geometric constraints.
We also introduce a novel Neural Feature Synthesizer (NeFeS) model, which encodes 3D geometric features during training and directly renders dense novel view features at test time to refine APR methods.
arXiv Detail & Related papers (2023-03-17T16:10:50Z)
- Multi-initialization Optimization Network for Accurate 3D Human Pose and Shape Estimation [75.44912541912252]
We propose a three-stage framework named Multi-Initialization Optimization Network (MION)
In the first stage, we strategically select different coarse 3D reconstruction candidates which are compatible with the 2D keypoints of input sample.
In the second stage, we design a mesh refinement transformer (MRT) to respectively refine each coarse reconstruction result via a self-attention mechanism.
Finally, a Consistency Estimation Network (CEN) is proposed to find the best result from multiple candidates by evaluating whether the visual evidence in the RGB image matches a given 3D reconstruction.
arXiv Detail & Related papers (2021-12-24T02:43:58Z)
- Warp Consistency for Unsupervised Learning of Dense Correspondences [116.56251250853488]
A key challenge in learning dense correspondences is the lack of ground-truth matches for real image pairs.
We propose Warp Consistency, an unsupervised learning objective for dense correspondence regression.
Our approach sets a new state-of-the-art on several challenging benchmarks, including MegaDepth, RobotCar and TSS.
arXiv Detail & Related papers (2021-04-07T17:58:22Z)
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
- Learning Deformable Tetrahedral Meshes for 3D Reconstruction [78.0514377738632]
3D shape representations that accommodate learning-based 3D reconstruction are an open problem in machine learning and computer graphics.
Previous work on neural 3D reconstruction demonstrated benefits, but also limitations, of point cloud, voxel, surface mesh, and implicit function representations.
We introduce Deformable Tetrahedral Meshes (DefTet) as a particular parameterization that utilizes volumetric tetrahedral meshes for the reconstruction problem.
arXiv Detail & Related papers (2020-11-03T02:57:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.