Fugu-MT Paper Translation (Abstract): FlyCo: Foundation Model-Empowered Drones for Autonomous 3D Structure Scanning in Open-World Environments

Paper summary: FlyCo: Foundation Model-Empowered Drones for Autonomous 3D Structure Scanning in Open-World Environments

arxiv url: http://arxiv.org/abs/2601.07558v1
Date: Mon, 12 Jan 2026 14:14:39 GMT
Status: Translation complete
Last updated in system: 2026-03-23 08:17:40.744484
Title: FlyCo: Foundation Model-Empowered Drones for Autonomous 3D Structure Scanning in Open-World Environments
Title (reference translation): FlyCo: Foundation Model-Driven Drones for Autonomous 3D Structure Scanning in Open-World Environments
Authors: Chen Feng, Guiyong Zheng, Tengkai Zhuang, Yongqian Wu, Fangzhan He, Haojia Li, Juepeng Zheng, Shaojie Shen, Boyu Zhou
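Abstract summary: FlyCo is an FM-empowered perception-prediction-planning loop that enables fully autonomous, prompt-driven 3D target scanning in diverse open-world environments. FlyCo delivers precise scene understanding, high efficiency, and real-time safety.
Reference score (attention score computed independently by this site): 26.006291392930844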
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Autonomous 3D scanning of open-world target structures via drones remains challenging despite broad applications. Existing paradigms rely on restrictive assumptions or effortful human priors, limiting practicality, efficiency, and adaptability. Recent foundation models (FMs) offer great potential to bridge this gap. This paper investigates a critical research problem: What system architecture can effectively integrate FM knowledge for this task? We answer it with FlyCo, a principled FM-empowered perception-prediction-planning loop enabling fully autonomous, prompt-driven 3D target scanning in diverse unknown open-world environments. FlyCo directly translates low-effort human prompts (text, visual annotations) into precise adaptive scanning flights via three coordinated stages: (1) perception fuses streaming sensor data with vision-language FMs for robust target grounding and tracking; (2) prediction distills FM knowledge and combines multi-modal cues to infer the partially observed target's complete geometry; (3) planning leverages predictive foresight to generate efficient and safe paths with comprehensive target coverage. Building on this, we further design key components to boost open-world target grounding efficiency and robustness, enhance prediction quality in terms of shape accuracy, zero-shot generalization, and temporal stability, and balance long-horizon flight efficiency with real-time computability and online collision avoidance. Extensive challenging real-world and simulation experiments show FlyCo delivers precise scene understanding, high efficiency, and real-time safety, outperforming existing paradigms with lower human effort and verifying the proposed architecture's practicality. Comprehensive ablations validate each component's contribution. FlyCo also serves as a flexible, extensible blueprint, readily leveraging future FM and robotics advances. Code will be released.
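Abstract (reference translation): Autonomous 3D scanning of open-world target structures by drones remains difficult despite its broad applications. Existing paradigms depend on restrictive assumptions or labor-intensive human priors, which limits their practicality, efficiency, and adaptability. Recent foundation models (FMs) have great potential to close this gap. This paper examines which system architecture can effectively integrate FM knowledge for this task. The answer is FlyCo, an FM-empowered perception-prediction-planning loop that enables fully autonomous, prompt-driven 3D target scanning in diverse unknown open-world environments. FlyCo translates low-effort human prompts (text, visual annotations) directly into precise, adaptive scanning flights through three coordinated stages: (1) perception fuses streaming sensor data with vision-language FMs for robust target grounding and tracking; (2) prediction distills FM knowledge and combines multi-modal cues to infer the complete geometry of the partially observed target; (3) planning uses predictive foresight to generate efficient, safe paths with comprehensive target coverage. On top of this, the authors design key components that improve the efficiency and robustness of open-world target grounding, raise prediction quality in terms of shape accuracy, zero-shot generalization, and temporal stability, and balance long-horizon flight efficiency against real-time computability and online collision avoidance. Extensive real-world and simulation experiments show that FlyCo delivers precise scene understanding, high efficiency, and real-time safety, outperforms existing paradigms with less human effort, and verifies the practicality of the proposed architecture. Comprehensive ablations validate the contribution of each component. FlyCo also serves as a flexible, extensible blueprint that can readily take advantage of future advances in FMs and robotics. Code will be released.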
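The three-stage loop described in the abstract can be pictured with a minimal Python sketch. This is not the authors' implementation (their code has not been released at the time of this listing); every name here (`run_loop`, `LoopState`, the `toy_*` stand-ins) is hypothetical, and the stages are placeholders for the FM-based grounding, shape prediction, and coverage planning that the paper describes. The sketch only shows how the stages feed one another on each iteration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Point = Tuple[float, float, float]


@dataclass
class LoopState:
    observed: List[Point] = field(default_factory=list)   # target points seen so far
    predicted: List[Point] = field(default_factory=list)  # inferred complete geometry
    path: List[Point] = field(default_factory=list)       # currently planned waypoints


def run_loop(
    sense: Callable[[int], List[Point]],
    perceive: Callable[[str, List[Point]], List[Point]],
    predict: Callable[[List[Point]], List[Point]],
    plan: Callable[[List[Point], List[Point]], List[Point]],
    prompt: str,
    steps: int,
) -> LoopState:
    """Run a generic perception-prediction-planning loop for `steps` iterations."""
    state = LoopState()
    for t in range(steps):
        frame = sense(t)                                   # streaming sensor data
        state.observed.extend(perceive(prompt, frame))     # (1) ground the prompted target
        state.predicted = predict(state.observed)          # (2) complete partial geometry
        state.path = plan(state.predicted, state.observed) # (3) replan for coverage
    return state


# Toy stand-ins (assumptions for illustration only, not FlyCo components).
def toy_sense(t: int) -> List[Point]:
    return [(float(t), 0.0, 0.0)]


def toy_perceive(prompt: str, frame: List[Point]) -> List[Point]:
    # Pretend every sensed point belongs to the prompted target.
    return frame


def toy_predict(observed: List[Point]) -> List[Point]:
    # Toy completion: assume the target is symmetric about the plane y = 0.5.
    mirrored = [(x, 1.0 - y, z) for (x, y, z) in observed]
    return observed + mirrored


def toy_plan(predicted: List[Point], observed: List[Point]) -> List[Point]:
    # Visit predicted-but-unseen points from 1 m above them.
    seen = set(observed)
    return [(x, y, z + 1.0) for (x, y, z) in predicted if (x, y, z) not in seen]


if __name__ == "__main__":
    final = run_loop(toy_sense, toy_perceive, toy_predict, toy_plan, "scan the tower", 3)
    print(final.path)
```

Passing the stages in as callables mirrors the abstract's framing of the architecture as an extensible blueprint: a stronger grounding or prediction model could be swapped in without changing the loop itself.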


Related papers