Fugu-MT Paper Translation (Summary): SF3D-RGB: Scene Flow Estimation from Monocular Camera and Sparse LiDAR

Paper summary: SF3D-RGB: Scene Flow Estimation from Monocular Camera and Sparse LiDAR

arxiv url: http://arxiv.org/abs/2602.21699v1
Date: Wed, 25 Feb 2026 09:03:42 GMT
Status: Translation complete
System update date: 2026-02-26 18:19:16.765243
Title: SF3D-RGB: Scene Flow Estimation from Monocular Camera and Sparse LiDAR
Authors: Rajai Alhimdiat, Ramy Battrawy, René Schuster, Didier Stricker, Wesam Ashour
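Abstract summary: This paper proposes a deep learning architecture for sparse scene flow estimation using 2D monocular images and 3D point clouds. The architecture is an end-to-end model that first encodes information from each modality into features and then fuses them. Experiments show that the proposed method outperforms single-modality methods and achieves better scene flow accuracy on real-world datasets.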
Reference score (independently computed attention score): 17.224692757126153
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Scene flow estimation is an extremely important task in computer vision to support the perception of dynamic changes in the scene. For robust scene flow, learning-based approaches have recently achieved impressive results using either image-based or LiDAR-based modalities. However, these methods have tended to focus on the use of a single modality. To tackle these problems, we present a deep learning architecture, SF3D-RGB, that enables sparse scene flow estimation using 2D monocular images and 3D point clouds (e.g., acquired by LiDAR) as inputs. Our architecture is an end-to-end model that first encodes information from each modality into features and fuses them together. Then, the fused features enhance a graph matching module for better and more robust mapping matrix computation to generate an initial scene flow. Finally, a residual scene flow module further refines the initial scene flow. Our model is designed to strike a balance between accuracy and efficiency. Furthermore, experiments show that our proposed method outperforms single-modality methods and achieves better scene flow accuracy on real-world datasets while using fewer parameters compared to other state-of-the-art methods with fusion.
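The first stages named in the abstract (per-modality features, fusion, a mapping matrix, an initial scene flow) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, fusion by concatenation, and the softmax over feature distances used in place of the graph matching module are all assumptions made for illustration. The learned encoders and the residual refinement module are omitted.

```python
import numpy as np


def fuse_features(point_feats, image_feats):
    """Fuse per-point LiDAR and image features (assumption: plain concatenation)."""
    return np.concatenate([point_feats, image_feats], axis=1)


def soft_mapping_matrix(feats_src, feats_tgt, temperature=0.01):
    """Row-stochastic soft correspondence matrix between source and target points.

    Assumption: a softmax over negative squared feature distances stands in
    for the graph matching module described in the abstract.
    """
    diff = feats_src[:, None, :] - feats_tgt[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    logits = -sq_dist / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def initial_scene_flow(points_src, points_tgt, mapping):
    """Initial flow: soft-matched target position minus source position."""
    return mapping @ points_tgt - points_src


if __name__ == "__main__":
    # Toy scene: four points that all move by the same translation.
    points_src = np.array([[0.0, 0.0, 0.0],
                           [1.0, 0.0, 0.0],
                           [0.0, 1.0, 0.0],
                           [0.0, 0.0, 1.0]])
    translation = np.array([0.5, 0.0, 0.0])
    points_tgt = points_src + translation

    # Hypothetical features: one distinctive descriptor per point,
    # identical in both frames so that matching is unambiguous.
    feats = fuse_features(np.eye(4), np.eye(4))

    mapping = soft_mapping_matrix(feats, feats)
    flow = initial_scene_flow(points_src, points_tgt, mapping)
    print(flow)
```

In this toy setup every point has a unique descriptor, so the mapping matrix is close to the identity and the recovered flow is close to the applied translation. With real, ambiguous features the initial flow would be noisier, which is the situation the abstract's residual refinement stage is meant to address.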

Related papers