Fugu-MT paper translation (abstract): A Simple yet Powerful Instance-Aware Prompting Framework for Training-free Camouflaged Object Segmentation

Paper overview: A Simple yet Powerful Instance-Aware Prompting Framework for Training-free Camouflaged Object Segmentation

arxiv url: http://arxiv.org/abs/2508.06904v2
Date: Wed, 13 Aug 2025 07:40:08 GMT
Status: translation complete
Last updated in system: 2025-08-14 14:06:00.553217
Title: A Simple yet Powerful Instance-Aware Prompting Framework for Training-free Camouflaged Object Segmentation
Authors: Chao Yin, Jide Li, Xiaoqiang Li
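Abstract summary: The paper proposes a training-free Camouflaged Object Segmentation (COS) pipeline that explicitly converts a task-generic prompt into fine-grained instance masks. The proposed IAPF significantly surpasses existing state-of-the-art training-free COS methods.
Reference score (attention score computed independently by this site): 6.712332323439369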
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Camouflaged Object Segmentation (COS) remains highly challenging due to the intrinsic visual similarity between target objects and their surroundings. While training-based COS methods achieve good performance, their performance degrades rapidly with increased annotation sparsity. To circumvent this limitation, recent studies have explored training-free COS methods, leveraging the Segment Anything Model (SAM) by automatically generating visual prompts from a single task-generic prompt (e.g., "camouflaged animal") uniformly applied across all test images. However, these methods typically produce only semantic-level visual prompts, causing SAM to output coarse semantic masks and thus failing to handle scenarios with multiple discrete camouflaged instances effectively. To address this critical limitation, we propose a simple yet powerful Instance-Aware Prompting Framework (IAPF), the first training-free COS pipeline that explicitly converts a task-generic prompt into fine-grained instance masks. Specifically, the IAPF comprises three steps: (1) Text Prompt Generator, utilizing task-generic queries to prompt a Multimodal Large Language Model (MLLM) for generating image-specific foreground and background tags; (2) Instance Mask Generator, leveraging Grounding DINO to produce precise instance-level bounding box prompts, alongside the proposed Single-Foreground Multi-Background Prompting strategy to sample region-constrained point prompts within each box, enabling SAM to yield a candidate instance mask; (3) Self-consistency Instance Mask Voting, which selects the final COS prediction by identifying the candidate mask most consistent across multiple candidate instance masks. Extensive evaluations on standard COS benchmarks demonstrate that the proposed IAPF significantly surpasses existing state-of-the-art training-free COS methods.
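Step (3) of the abstract, Self-consistency Instance Mask Voting, can be illustrated with a minimal sketch. The abstract does not say how consistency is measured, so the mean pairwise IoU used here is an assumption, not the authors' method. The names `iou`, `vote`, and `rect` are hypothetical and exist only for this illustration.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]   # (row, col) of a foreground pixel
Mask = Set[Pixel]         # a binary mask stored as its set of foreground pixels


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union; two empty masks are treated as identical."""
    union = len(a | b)
    return 1.0 if union == 0 else len(a & b) / union


def vote(candidates: List[Mask]) -> int:
    """Return the index of the candidate with the highest mean IoU to the others.

    Assumption: "most consistent" is read as highest mean pairwise IoU.
    Ties are resolved in favour of the earliest candidate.
    """
    if not candidates:
        raise ValueError("need at least one candidate mask")
    if len(candidates) == 1:
        return 0
    scores = []
    for i, mask in enumerate(candidates):
        others = [iou(mask, other) for j, other in enumerate(candidates) if j != i]
        scores.append(sum(others) / len(others))
    return max(range(len(candidates)), key=scores.__getitem__)


def rect(r0: int, c0: int, r1: int, c1: int) -> Mask:
    """Hypothetical helper: a filled rectangle mask over rows [r0, r1) and cols [c0, c1)."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Two candidates that nearly agree and one outlier.
candidates = [rect(0, 0, 4, 4), rect(0, 0, 4, 5), rect(2, 2, 8, 8)]
best = vote(candidates)
```

In this toy case the outlier rectangle receives the lowest score, and the winner is one of the two masks that largely overlap. In a real pipeline the candidates would be SAM outputs for the same instance under different sampled prompts.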

Related papers