Fugu-MT Paper Translation (Abstract): On-the-Fly OVD Adaptation with FLAME: Few-shot Localization via Active Marginal-Samples Exploration

Paper summary: On-the-Fly OVD Adaptation with FLAME: Few-shot Localization via Active Marginal-Samples Exploration

arxiv url: http://arxiv.org/abs/2510.17670v1
Date: Mon, 20 Oct 2025 15:41:55 GMT
Status: Translation complete
Last updated in system: 2025-10-25 00:56:39.507419
Title: On-the-Fly OVD Adaptation with FLAME: Few-shot Localization via Active Marginal-Samples Exploration
Authors: Yehonathan Refael, Amit Aides, Aviad Barzilai, George Leifman, Genady Beryozkin, Vered Silverman, Bolous Jaber, Tomer Shekel
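Abstract summary: Open-vocabulary object detection (OVD) models offer remarkable flexibility by detecting objects from arbitrary text queries. Their zero-shot performance in specialized domains such as Remote Sensing (RS), however, is often compromised by the inherent ambiguity of natural language. To address this, this work proposes a cascaded approach that couples the broad generalization of a large pre-trained OVD model with a lightweight few-shot classifier.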
Reference score (attention level, computed independently by this site): 1.7975230539002824
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Open-vocabulary object detection (OVD) models offer remarkable flexibility by detecting objects from arbitrary text queries. However, their zero-shot performance in specialized domains like Remote Sensing (RS) is often compromised by the inherent ambiguity of natural language, limiting critical downstream applications. For instance, an OVD model may struggle to distinguish between fine-grained classes such as "fishing boat" and "yacht" since their embeddings are similar and often inseparable. This can hamper specific user goals, such as monitoring illegal fishing, by producing irrelevant detections. To address this, we propose a cascaded approach that couples the broad generalization of a large pre-trained OVD model with a lightweight few-shot classifier. Our method first employs the zero-shot model to generate high-recall object proposals. These proposals are then refined for high precision by a compact classifier trained in real time on only a handful of user-annotated examples, drastically reducing the high costs of RS imagery annotation. The core of our framework is FLAME, a one-step active learning strategy that selects the most informative samples for training. FLAME identifies, on the fly, uncertain marginal candidates near the decision boundary using density estimation, followed by clustering to ensure sample diversity. This efficient sampling technique achieves high accuracy without costly full-model fine-tuning and enables instant adaptation in less than a minute, which is significantly faster than state-of-the-art alternatives. Our method consistently surpasses state-of-the-art performance on RS benchmarks, establishing a practical and resource-efficient framework for adapting foundation models to specific user needs.
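
The selection step described in the abstract (find uncertain candidates near the decision boundary by density estimation, then cluster them for diversity) can be illustrated with a rough sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so every function name, the bandwidth, the percentile range, and the choice of a nearest-centroid classifier as the "compact classifier" are assumptions made purely for illustration. The sketch assumes each proposal from the zero-shot detector comes with a scalar score and an embedding vector.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kde(scores, x, bandwidth):
    """Gaussian kernel density estimate of the 1-D score distribution at x."""
    norm = len(scores) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in scores) / norm


def estimate_boundary(scores, bandwidth=0.05, grid=200):
    """Assumption: treat the lowest-density score between the 5th and 95th
    percentiles as a stand-in for the decision boundary."""
    ranked = sorted(scores)
    lo = ranked[int(0.05 * (len(ranked) - 1))]
    hi = ranked[int(0.95 * (len(ranked) - 1))]
    xs = [lo + (hi - lo) * i / (grid - 1) for i in range(grid)]
    return min(xs, key=lambda x: kde(scores, x, bandwidth))


def marginal_indices(scores, n_marginal):
    """Indices of the n_marginal proposals whose scores lie closest to the boundary."""
    boundary = estimate_boundary(scores)
    order = sorted(range(len(scores)), key=lambda i: abs(scores[i] - boundary))
    return order[:n_marginal]


def kmeans_representatives(points, k, iters=20, seed=0):
    """Plain k-means; returns the index of the point nearest each centroid.
    May return fewer than k indices if two centroids share a nearest point."""
    k = min(k, len(points))
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda c: sq_dist(p, centroids[c]))].append(p)
        for c, g in enumerate(groups):
            if g:  # keep the previous centroid if a cluster went empty
                centroids[c] = [sum(col) / len(g) for col in zip(*g)]
    reps = {
        min(range(len(points)), key=lambda i: sq_dist(points[i], centroids[c]))
        for c in range(k)
    }
    return sorted(reps)


def select_for_annotation(embeddings, scores, n_marginal=40, k=8):
    """Marginal candidates first, then one representative per cluster for diversity."""
    marginal = marginal_indices(scores, n_marginal)
    reps = kmeans_representatives([embeddings[i] for i in marginal], k)
    return [marginal[r] for r in reps]


def fit_centroids(embeddings, labels):
    """Stand-in compact classifier (an assumption): one mean embedding per label."""
    by_label = {}
    for e, y in zip(embeddings, labels):
        by_label.setdefault(y, []).append(e)
    return {y: [sum(col) / len(es) for col in zip(*es)] for y, es in by_label.items()}


def predict(centroids, embedding):
    """Label of the nearest class centroid; used to filter the detector's proposals."""
    return min(centroids, key=lambda y: sq_dist(embedding, centroids[y]))
```

In a cascade of this shape, the indices returned by `select_for_annotation` would be shown to the user for labeling, `fit_centroids` would be trained on those few labels, and `predict` would then keep or discard each high-recall proposal. Restricting annotation to diverse samples near the boundary is what keeps the labeling budget small in this sketch.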


Related papers