Fugu-MT Paper Translation (Summary): MAD: Mapping-Aware World Models for Agile Quadrotor Flight

Paper summary: MAD: Mapping-Aware World Models for Agile Quadrotor Flight

arxiv url: http://arxiv.org/abs/2606.04534v1
Date: Wed, 03 Jun 2026 07:17:57 GMT
Status: Translation complete
Last updated in system: 2026-06-04 20:44:18.601578
Title: MAD: Mapping-Aware World Models for Agile Quadrotor Flight
Title (reference translation): MAD: Mapping-Aware World Models for Agile Quadrotor Flight
Authors: Xinhong Zhang, Runqing Wang, Yunfan Ren, Ding Yu, Boyu Zhou, Jian Sun, Fang Deng, Jie Chen, Gang Wang
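Abstract summary: Mapping-Aware Dreamer (MAD) is a geometry-aware world model for vision-based quadrotor flight. MAD learns recurrent latent dynamics that reconstruct robocentric occupancy and visibility grid maps. The authors demonstrate safe indoor and outdoor flight under limited sensing, reaching 9.66 m/s in simulation and 5.05 m/s in real-world forest experiments.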
Reference score (independently computed attention score): 47.458890396048126
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Agile quadrotor flight in cluttered scenes requires more than a reactive mapping from a depth image to a control command: the vehicle must remember which regions have been observed, infer nearby occupied space, and act under partial visibility and tight latency. In this paper, we present Mapping-Aware Dreamer (MAD), a geometry-aware world model for vision-based quadrotor flight. Instead of using raw-image reconstruction as the main self-supervised objective, MAD learns recurrent latent dynamics that reconstruct robocentric occupancy and visibility grid maps together with proprioceptive states. This design forces the latent state to encode local geometry, visibility history, and ego-motion in a form that is directly relevant to collision avoidance. MAD is trained in DiffAero using a GPU-parallel map-construction module that provides high-throughput supervision for occupancy and visibility. The learned representation is used in three policy-learning modes: imagination-based MAD-Dreamer and feature-extractor variants based on PPO and SHAC. Across visual navigation and racing tasks, MAD-based agents achieve higher success rates, faster flight, and better cross-task transfer than corresponding vision-only baselines. The model also produces interpretable map predictions and accurate ego-motion estimates from depth observations. We further deploy the learned policy on a physical quadrotor with an Intel RealSense D435i and demonstrate safe indoor and outdoor flight under limited sensing, reaching 9.66 m/s in simulation and 5.05 m/s in real-world forest experiments. These results show that mapping-aware world models provide a practical middle ground between modular aerial navigation and end-to-end learning.
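Abstract (reference translation): The vehicle must remember which regions have been observed, infer nearby occupied space, and act under partial visibility and tight latency. In this paper, we propose Mapping-Aware Dreamer (MAD), a geometry-aware world model for vision-based quadrotor flight. Instead of making raw-image reconstruction the main objective, MAD learns recurrent latent dynamics that reconstruct robocentric occupancy and visibility grid maps together with proprioceptive states. This design forces the latent state to encode local geometry, visibility history, and ego-motion in a form directly relevant to collision avoidance. MAD is trained in DiffAero using a GPU-parallel map-construction module. The learned representation is used in three policy-learning modes: the imagination-based MAD-Dreamer and feature-extractor variants based on PPO and SHAC. Across visual navigation and racing tasks, MAD-based agents achieve higher success rates, faster flight, and better cross-task transfer than the corresponding vision-only baselines. The model also produces interpretable map predictions and accurate ego-motion estimates from depth observations. We further deploy the learned policy on a physical quadrotor with an Intel RealSense D435i and demonstrate safe indoor and outdoor flight under limited sensing, reaching 9.66 m/s in simulation and 5.05 m/s in real-world forest experiments. These results suggest that mapping-aware world models offer a middle ground between modular aerial navigation and end-to-end learning.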

Related papers