Fugu-MT Paper Translation (Abstract): Aerial Monocular 3D Object Detection

Paper overview: Aerial Monocular 3D Object Detection

arxiv url: http://arxiv.org/abs/2208.03974v2
Date: Mon, 20 Jan 2025 18:54:51 GMT
Status: Translation complete
Last updated in system: 2025-01-23 16:42:35.366441
Title: Aerial Monocular 3D Object Detection
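Title (reference translation): Aerial Monocular 3D Object Detection
Authors: Yue Hu, Shaoheng Fang, Weidi Xie, Siheng Chen
Abstract summary: DVDET is proposed to achieve aerial monocular 3D object detection in both the 2D image space and the 3D physical space. To address the severe view deformation issue, a novel geo-deformable transformation module is proposed. To encourage more researchers to investigate this area, the dataset and related code will be released.
Reference score (independently computed measure of attention): 67.20369963664314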
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Drones equipped with cameras can significantly enhance human ability to perceive the world because of their remarkable maneuverability in 3D space. Ironically, object detection for drones has always been conducted in the 2D image space, which fundamentally limits their ability to understand 3D scenes. Furthermore, existing 3D object detection methods developed for autonomous driving cannot be directly applied to drones due to the lack of deformation modeling, which is essential for the distant aerial perspective with sensitive distortion and small objects. To fill the gap, this work proposes a dual-view detection system named DVDET to achieve aerial monocular object detection in both the 2D image space and the 3D physical space. To address the severe view deformation issue, we propose a novel trainable geo-deformable transformation module that can properly warp information from the drone's perspective to the bird's-eye view (BEV). Compared to the monocular methods for cars, our transformation includes a learnable deformable network for explicitly revising the severe deviation. To address the dataset challenge, we propose a new large-scale simulation dataset named AM3D-Sim, generated by the co-simulation of AirSim and CARLA, and a new real-world aerial dataset named AM3D-Real, collected by a DJI Matrice 300 RTK. In both datasets, high-quality annotations for 3D object detection are provided. Extensive experiments show that i) aerial monocular 3D object detection is feasible; ii) the model pre-trained on the simulation dataset benefits real-world performance; and iii) DVDET also benefits monocular 3D object detection for cars. To encourage more researchers to investigate this area, we will release the dataset and related code at https://github.com/PhyllisH/DVDET.
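Abstract (reference translation): Drones equipped with cameras can greatly improve humans' ability to perceive the world thanks to their remarkable maneuverability in 3D space. Ironically, object detection for drones has always been performed in the 2D image space, which fundamentally limits their ability to understand 3D scenes. Furthermore, existing 3D object detection methods developed for autonomous driving cannot be applied directly to drones because they lack deformation modeling. To fill this gap, we propose a dual-view detection system named DVDET to achieve aerial monocular object detection in both the 2D image space and the 3D physical space. To address the severe view deformation problem, we propose a novel trainable geo-deformable transformation module that can properly warp information from the drone's perspective to the BEV. Compared with monocular methods for cars, our transformation includes a learnable deformable network for explicitly correcting the severe deviation. To address the dataset challenge, we propose a new large-scale simulation dataset named AM3D-Sim, generated by the co-simulation of AirSim and CARLA, and a new real-world aerial dataset named AM3D-Real, collected by a DJI Matrice 300 RTK. Extensive experiments show that i) aerial monocular 3D object detection is feasible; ii) the model pre-trained on the simulation dataset benefits real-world performance; and iii) DVDET is also useful for monocular 3D object detection for cars. To encourage more researchers to investigate this area, we will release the dataset and related code at https://github.com/PhyllisH/DVDET.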

Related papers