Fugu-MT Paper Translation (Abstract): A Comparison of Multi-View Stereo Methods for Photogrammetric 3D Reconstruction: From Traditional to Learning-Based Approaches

Paper abstract: A Comparison of Multi-View Stereo Methods for Photogrammetric 3D Reconstruction: From Traditional to Learning-Based Approaches

arxiv url: http://arxiv.org/abs/2604.10246v1
Date: Sat, 11 Apr 2026 15:02:08 GMT
Status: Translation complete
Last updated in system: 2026-04-14 20:13:15.922358
Title: A Comparison of Multi-View Stereo Methods for Photogrammetric 3D Reconstruction: From Traditional to Learning-Based Approaches
Authors: Yawen Li, George Vosselman, Francesco Nex
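Abstract summary: Photogrammetric 3D reconstruction has long relied on Structure-from-Motion (SfM) and Multi-View Stereo (MVS) methods. Recently, learning-based MVS methods have emerged, aiming for faster and more efficient reconstruction. This work presents a comparative evaluation of a representative traditional MVS pipeline (COLMAP) and state-of-the-art learning-based approaches.
Reference score (independently computed attention score): 9.65071161442607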
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Photogrammetric 3D reconstruction has long relied on traditional Structure-from-Motion (SfM) and Multi-View Stereo (MVS) methods, which provide high accuracy but face challenges in speed and scalability. Recently, learning-based MVS methods have emerged, aiming for faster and more efficient reconstruction. This work presents a comparative evaluation between a representative traditional MVS pipeline (COLMAP) and state-of-the-art learning-based approaches, including geometry-guided methods (MVSNet, PatchmatchNet, MVSAnywhere, MVSFormer++) and end-to-end frameworks (Stereo4D, FoundationStereo, DUSt3R, MASt3R, Fast3R, VGGT). Two experiments were conducted on different aerial scenarios. The first experiment used the MARS-LVIG dataset, where ground-truth 3D reconstruction was provided by LiDAR point clouds. The second experiment used a public scene from the Pix4D official website, with ground truth generated by Pix4Dmapper. We evaluated accuracy, coverage, and runtime across all methods. Experimental results show that although COLMAP can provide reliable and geometrically consistent reconstruction results, it requires more computation time. In cases where traditional methods fail in image registration, learning-based approaches exhibit stronger feature-matching capability and greater robustness. Geometry-guided methods usually require careful dataset preparation and often depend on camera pose or depth priors generated by COLMAP. End-to-end methods such as DUSt3R and VGGT achieve competitive accuracy and reasonable coverage while offering substantially faster reconstruction. However, they exhibit relatively large residuals in 3D reconstruction, particularly in challenging scenarios.
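The abstract names accuracy and coverage as evaluation criteria but does not define them. As a minimal sketch, assuming one common convention for comparing a reconstructed point cloud against a ground-truth cloud (accuracy as the mean nearest-neighbor distance from reconstruction to ground truth, coverage as the fraction of ground-truth points that have a reconstructed point within a distance threshold), the two quantities could be computed as follows. The function names, the threshold, and the toy points are hypothetical and are not taken from the paper.

```python
import math

def nearest_distances(src, dst):
    """For each point in src, return the Euclidean distance to its
    nearest point in dst (brute force; fine for small clouds only)."""
    return [min(math.dist(p, q) for q in dst) for p in src]

def evaluate(reconstruction, ground_truth, threshold):
    """Return (accuracy, coverage) under the assumed definitions.

    accuracy: mean distance from each reconstructed point to the
              nearest ground-truth point (lower is better).
    coverage: fraction of ground-truth points that have a reconstructed
              point within `threshold` (higher is better).
    """
    rec_to_gt = nearest_distances(reconstruction, ground_truth)
    accuracy = sum(rec_to_gt) / len(rec_to_gt)

    gt_to_rec = nearest_distances(ground_truth, reconstruction)
    coverage = sum(d <= threshold for d in gt_to_rec) / len(gt_to_rec)
    return accuracy, coverage

# Toy example: the reconstruction covers only half of the ground truth.
ground_truth = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
reconstruction = [(0, 0, 0.1), (1, 0, 0.1)]
accuracy, coverage = evaluate(reconstruction, ground_truth, threshold=0.5)
```

In this toy case the reconstructed points lie close to the ground truth, so the accuracy value is small, yet only two of the four ground-truth points are covered. This illustrates why the two measures are reported separately. For real aerial point clouds a spatial index such as a k-d tree would replace the brute-force search.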

Related papers