Fugu-MT paper translation (abstract): Fusion-Poly: A Polyhedral Framework Based on Spatial-Temporal Fusion for 3D Multi-Object Tracking

Paper overview: Fusion-Poly: A Polyhedral Framework Based on Spatial-Temporal Fusion for 3D Multi-Object Tracking

arxiv url: http://arxiv.org/abs/2603.08199v1
Date: Mon, 09 Mar 2026 10:26:44 GMT
Status: translation complete
Last updated in system: 2026-03-10 15:13:15.792108
Title: Fusion-Poly: A Polyhedral Framework Based on Spatial-Temporal Fusion for 3D Multi-Object Tracking
Title (reference translation): Fusion-Poly: A Polyhedral Framework Based on Spatio-Temporal Fusion for 3D Multi-Object Tracking
Authors: Xian Wu, Yitao Wu, Xiaoyu Li, Zijia Li, Lijun Zhao, Lining Sun
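Abstract summary: Fusion-Poly is a spatial-temporal fusion framework for 3D MOT that integrates asynchronous LiDAR and camera data. On the nuScenes test set, Fusion-Poly achieves 76.5% AMOTA.
Reference score (independently computed attention metric): 11.834891226231898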
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: LiDAR-camera 3D multi-object tracking (MOT) combines rich visual semantics with accurate depth cues to improve trajectory consistency and tracking reliability. In practice, however, LiDAR and cameras operate at different sampling rates. To maintain temporal alignment, existing data pipelines usually synchronize heterogeneous sensor streams and annotate them at a reduced shared frequency, forcing most prior methods to perform spatial fusion only at synchronized timestamps through projection-based or learnable cross-sensor association. As a result, abundant asynchronous observations remain underexploited, despite their potential to support more frequent association and more robust trajectory estimation over short temporal intervals. To address this limitation, we propose Fusion-Poly, a spatial-temporal fusion framework for 3D MOT that integrates asynchronous LiDAR and camera data. Fusion-Poly associates trajectories with multi-modal observations at synchronized timestamps and with single-modal observations at asynchronous timestamps, enabling higher-frequency updates of motion and existence states. The framework contains three key components: a frequency-aware cascade matching module that adapts to synchronized and asynchronous frames according to available detection modalities; a frequency-aware trajectory estimation module that maintains trajectories through high-frequency motion prediction, differential updates, and confidence-calibrated lifecycle management; and a full-state observation alignment module that improves cross-modal consistency at synchronized timestamps by optimizing image-projection errors. On the nuScenes test set, Fusion-Poly achieves 76.5% AMOTA, establishing a new state of the art among tracking-by-detection 3D MOT methods. Extensive ablation studies further validate the effectiveness of each component. Code will be released.
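Abstract (reference translation): LiDAR-camera 3D Multi-Object Tracking (MOT) combines rich visual semantics with accurate depth cues to improve trajectory consistency and tracking reliability. In practice, however, LiDAR and cameras operate at different sampling rates. To maintain temporal alignment, existing data pipelines usually synchronize heterogeneous sensor streams and annotate them at a reduced shared frequency. As a result, abundant asynchronous observations remain underexploited, despite their potential to support more frequent association and more robust trajectory estimation over short time intervals. To address this limitation, we propose Fusion-Poly, a spatial-temporal fusion framework for 3D MOT that integrates asynchronous LiDAR and camera data. Fusion-Poly associates trajectories with multi-modal observations at synchronized timestamps and with single-modal observations at asynchronous timestamps, enabling high-frequency updates of motion and existence states. The framework comprises a frequency-aware cascade matching module that adapts to synchronized and asynchronous frames according to the available detection modalities; a frequency-aware trajectory estimation module that maintains trajectories through high-frequency motion prediction, differential updates, and confidence-calibrated lifecycle management; and a full-state observation alignment module that improves cross-modal consistency at synchronized timestamps by optimizing image-projection errors. On the nuScenes test set, Fusion-Poly achieves 76.5% AMOTA. Extensive ablation studies further validate the effectiveness of each component. The code will be released.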
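The abstract distinguishes synchronized timestamps (both modalities available) from asynchronous ones (a single modality). The following is a minimal Python sketch of that distinction only, not the authors' implementation: the names `classify_frames` and `predict_constant_velocity`, the tolerance `tol`, and the constant-velocity motion model are all assumptions introduced for illustration, since the abstract does not specify them.

```python
def classify_frames(lidar_ts, camera_ts, tol=1e-3):
    """Merge two sorted timestamp lists into one labeled timeline.

    Timestamps closer than `tol` seconds are treated as one synchronized
    frame; all others are single-modality (asynchronous) frames.
    `tol` is a hypothetical parameter, not taken from the paper.
    """
    frames = []
    i = j = 0
    inf = float("inf")
    while i < len(lidar_ts) or j < len(camera_ts):
        t_lidar = lidar_ts[i] if i < len(lidar_ts) else inf
        t_cam = camera_ts[j] if j < len(camera_ts) else inf
        if abs(t_lidar - t_cam) <= tol:
            frames.append((t_lidar, "synchronized"))
            i += 1
            j += 1
        elif t_lidar < t_cam:
            frames.append((t_lidar, "lidar_only"))
            i += 1
        else:
            frames.append((t_cam, "camera_only"))
            j += 1
    return frames


def predict_constant_velocity(position, velocity, dt):
    """Propagate a 2D position over dt seconds (assumed motion model)."""
    return (position[0] + velocity[0] * dt, position[1] + velocity[1] * dt)


if __name__ == "__main__":
    # Illustrative timestamps only; not real sensor rates.
    timeline = classify_frames([0.0, 0.5, 1.0], [0.0, 0.25, 0.75, 1.0])
    position, velocity, last_t = (0.0, 0.0), (2.0, 0.0), 0.0
    for t, kind in timeline:
        # Predict to every frame, not just the synchronized ones.
        position = predict_constant_velocity(position, velocity, t - last_t)
        last_t = t
        print(t, kind, position)
```

In this sketch a tracker would run its multi-modal association on `synchronized` frames and a single-modal association on the others, so the track state is refreshed at every observation rather than only at the shared annotation frequency.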

Related papers