Fugu-MT Paper Translation (Abstract): EKF-Based Depth Camera and Deep Learning Fusion for UAV-Person Distance Estimation and Following in SAR Operations

Paper overview: EKF-Based Depth Camera and Deep Learning Fusion for UAV-Person Distance Estimation and Following in SAR Operations

arxiv url: http://arxiv.org/abs/2602.20958v1
Date: Tue, 24 Feb 2026 14:37:36 GMT
Status: Translation complete
Last updated in system: 2026-02-25 17:34:53.790444
Title: EKF-Based Depth Camera and Deep Learning Fusion for UAV-Person Distance Estimation and Following in SAR Operations
Authors: Luka Šiktar, Branimir Ćaran, Bojan Šekoranja, Marko Švaco
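Abstract summary: Vision-based UAVs aid human search tasks by detecting and recognizing specific individuals, then tracking and following them while maintaining a safe distance. A key safety requirement for UAV following is accurate estimation of the distance between the camera and the target under real-world conditions. This paper presents the fusion of depth camera measurements and monocular camera-to-body distance estimation for robust tracking and following.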
Reference score (independently computed attention score): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Search and rescue (SAR) operations require rapid responses to save lives or property. Unmanned Aerial Vehicles (UAVs) equipped with vision-based systems support these missions through prior terrain investigation or real-time assistance during the mission itself. Vision-based UAV frameworks aid human search tasks by detecting and recognizing specific individuals, then tracking and following them while maintaining a safe distance. A key safety requirement for UAV following is the accurate estimation of the distance between camera and target object under real-world conditions, achieved by fusing multiple image modalities. UAVs with deep learning-based vision systems offer a new approach to the planning and execution of SAR operations. As part of the system for automatic people detection and face recognition using deep learning, in this paper we present the fusion of depth camera measurements and monocular camera-to-body distance estimation for robust tracking and following. Deep learning-based filtering of depth camera data and estimation of camera-to-body distance from a monocular camera are achieved with YOLO-pose, enabling real-time fusion of depth information using the Extended Kalman Filter (EKF) algorithm. The proposed subsystem, designed for use in drones, estimates and measures the distance between the depth camera and the human body keypoints, to maintain the safe distance between the drone and the human target. Our system provides an accurate estimated distance, which has been validated against motion capture ground truth data. The system has been tested in real time indoors, where it reduces the average errors, root mean square error (RMSE) and standard deviations of distance estimation by up to 15.3% in three tested scenarios.
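The abstract does not give the filter equations, so the following is only a minimal sketch of how an EKF could fuse a direct depth reading with a monocular, keypoint-based cue. Everything specific here is an assumption made for illustration and is not taken from the paper: the constant-velocity state [distance, radial velocity], the pinhole relation between a body segment's pixel length and distance, the constants `FOCAL_PX` and `BODY_M`, the noise values, and the names `DistanceEKF` and `simulate`.

```python
# Hypothetical constants (not from the paper): focal length in pixels and the
# real-world length of a body segment between two pose keypoints.
FOCAL_PX = 600.0
BODY_M = 0.5


class DistanceEKF:
    """EKF over the state [distance d (m), radial velocity v (m/s)]."""

    def __init__(self, d0=2.0, q_d=1e-3, q_v=1e-2):
        self.d, self.v = d0, 0.0
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q_d, self.q_v = q_d, q_v      # process noise (assumed values)

    def predict(self, dt):
        # Constant-velocity motion model: F = [[1, dt], [0, 1]], P = F P F^T + Q.
        (p00, p01), (p10, p11) = self.P
        self.d += self.v * dt
        self.P = [
            [p00 + dt * (p01 + p10) + dt * dt * p11 + self.q_d, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q_v],
        ]

    def _update(self, z, h, hd, r):
        # Scalar measurement z with prediction h and Jacobian [hd, 0].
        (p00, p01), (p10, p11) = self.P
        s = hd * p00 * hd + r                 # innovation variance
        k0, k1 = p00 * hd / s, p10 * hd / s   # Kalman gain
        y = z - h                             # innovation
        self.d += k0 * y
        self.v += k1 * y
        self.P = [
            [(1 - k0 * hd) * p00, (1 - k0 * hd) * p01],
            [p10 - k1 * hd * p00, p11 - k1 * hd * p01],
        ]

    def update_depth(self, z_m, r=0.05 ** 2):
        # Depth camera measures distance directly: h(x) = d, Jacobian 1.
        self._update(z_m, self.d, 1.0, r)

    def update_mono(self, length_px, r=4.0 ** 2):
        # Pinhole model: pixel length = f * L / d, which is nonlinear in d,
        # so the Jacobian -f * L / d^2 is evaluated at the current estimate.
        h = FOCAL_PX * BODY_M / self.d
        hd = -FOCAL_PX * BODY_M / self.d ** 2
        self._update(length_px, h, hd, r)


def simulate(true_d=3.0, steps=50, dt=1 / 30):
    """Feed noise-free synthetic measurements of a static target."""
    ekf = DistanceEKF()
    for _ in range(steps):
        ekf.predict(dt)
        ekf.update_depth(true_d)
        ekf.update_mono(FOCAL_PX * BODY_M / true_d)
    return ekf.d


if __name__ == "__main__":
    print(f"estimated distance: {simulate():.3f} m")
```

In this sketch the depth update is linear and the monocular update is the only nonlinear step, which is what makes the filter an EKF rather than a plain Kalman filter. In a real pipeline the two measurements would come from the depth image sampled at pose keypoints and from the keypoint geometry in the RGB frame, with noise variances tuned from data.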

Related papers