Fugu-MT Paper Translation (Abstract): LoMa: Local Feature Matching Revisited

Paper overview: LoMa: Local Feature Matching Revisited

arxiv url: http://arxiv.org/abs/2604.04931v1
Date: Mon, 06 Apr 2026 17:59:49 GMT
Status: Translation complete
Last updated in system: 2026-04-07 15:49:19.338357
Title: LoMa: Local Feature Matching Revisited
Title (reference translation): LoMa: Local Feature Matching Revisited
Authors: David Nordström, Johan Edstedt, Georg Bökman, Jonathan Astermark, Anders Heyden, Viktor Larsson, Mårten Wadenbäck, Michael Felsberg, Fredrik Kahl
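Abstract summary: Local feature matching has long been a fundamental component of 3D vision systems such as Structure-from-Motion (SfM). This paper revisits local feature matching from a data-driven perspective. Combining large and diverse data mixtures, modern training recipes, scaled model capacity, and scaled compute yields remarkable gains in performance.
Reference score (independently computed attention score): 56.73318466794448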
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Local feature matching has long been a fundamental component of 3D vision systems such as Structure-from-Motion (SfM), yet progress has lagged behind the rapid advances of modern data-driven approaches. The newer approaches, such as feed-forward reconstruction models, have benefited extensively from scaling dataset sizes, whereas local feature matching models are still only trained on a few mid-sized datasets. In this paper, we revisit local feature matching from a data-driven perspective. In our approach, which we call LoMa, we combine large and diverse data mixtures, modern training recipes, scaled model capacity, and scaled compute, resulting in remarkable gains in performance. Since current standard benchmarks mainly rely on collecting sparse views from successful 3D reconstructions, the evaluation of progress in feature matching has been limited to relatively easy image pairs. To address the resulting saturation of benchmarks, we collect 1000 highly challenging image pairs from internet data into a new dataset called HardMatch. Ground truth correspondences for HardMatch are obtained via manual annotation by the authors. In our extensive benchmarking suite, we find that LoMa makes outstanding progress across the board, outperforming the state-of-the-art method ALIKED+LightGlue by +18.6 mAA on HardMatch, +29.5 mAA on WxBS, +21.4 (1m, 10$^\circ$) on InLoc, +24.2 AUC on RUBIK, and +12.4 mAA on IMC 2022. We release our code and models publicly at https://github.com/davnords/LoMa.
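Abstract (reference translation): Local feature matching has long served as a fundamental component of 3D vision systems such as Structure-from-Motion (SfM), but it has fallen behind the rapid advances of modern data-driven approaches. Newer approaches such as feed-forward reconstruction models have benefited greatly from scaling dataset sizes, whereas local feature matching models are still trained on only a few mid-sized datasets. This paper revisits local feature matching from a data-driven perspective. The approach, called LoMa, combines large and diverse data mixtures, modern training recipes, scaled model capacity, and scaled compute, and achieves remarkable gains in performance. Because current standard benchmarks mainly collect sparse views from successful 3D reconstructions, the evaluation of progress in feature matching has been limited to relatively easy image pairs. To address the resulting benchmark saturation, 1000 highly challenging image pairs are collected from internet data into a new dataset called HardMatch. Ground truth correspondences for HardMatch are obtained through manual annotation by the authors. Across an extensive benchmarking suite, LoMa makes outstanding progress across the board, outperforming the state-of-the-art method ALIKED+LightGlue by +18.6 mAA on HardMatch, +29.5 mAA on WxBS, +21.4 (1m, 10$^\circ$) on InLoc, +24.2 AUC on RUBIK, and +12.4 mAA on IMC 2022. The code and models are publicly available at https://github.com/davnords/LoMa.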
