Fugu-MT Paper Translation (Abstract): First RAG, Second SEG: A Training-Free Paradigm for Camouflaged Object Detection

Paper overview: First RAG, Second SEG: A Training-Free Paradigm for Camouflaged Object Detection

arxiv url: http://arxiv.org/abs/2508.15313v2
Date: Mon, 15 Sep 2025 05:21:07 GMT
Status: Translation complete
Last updated in system: 2025-09-16 15:23:16.348176
Title: First RAG, Second SEG: A Training-Free Paradigm for Camouflaged Object Detection
Title (reference translation): First RAG, Second SEG: A Training-Free Paradigm for Camouflaged Object Detection
Authors: Wutao Liu, YiDan Wang, Pan Gao
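Abstract summary: Existing approaches often rely on heavy training and large computational resources. RAG-SEG decouples COD into two stages: Retrieval-Augmented Generation (RAG), which generates coarse masks as prompts, and SAM-based segmentation (SEG) for refinement. RAG-SEG constructs a compact retrieval database via unsupervised clustering, enabling fast and effective feature retrieval. Experiments on benchmark COD datasets show that RAG-SEG performs on par with or surpasses state-of-the-art methods.
Reference score (independently computed attention score): 14.070196423996045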
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Camouflaged object detection (COD) poses a significant challenge in computer vision due to the high similarity between objects and their backgrounds. Existing approaches often rely on heavy training and large computational resources. While foundation models such as the Segment Anything Model (SAM) offer strong generalization, they still struggle to handle COD tasks without fine-tuning and require high-quality prompts to yield good performance. However, generating such prompts manually is costly and inefficient. To address these challenges, we propose First RAG, Second SEG (RAG-SEG), a training-free paradigm that decouples COD into two stages: Retrieval-Augmented Generation (RAG) for generating coarse masks as prompts, followed by SAM-based segmentation (SEG) for refinement. RAG-SEG constructs a compact retrieval database via unsupervised clustering, enabling fast and effective feature retrieval. During inference, the retrieved features produce pseudo-labels that guide precise mask generation using SAM2. Our method eliminates the need for conventional training while maintaining competitive performance. Extensive experiments on benchmark COD datasets demonstrate that RAG-SEG performs on par with or surpasses state-of-the-art methods. Notably, all experiments are conducted on a personal laptop, highlighting the computational efficiency and practicality of our approach. We present further analysis in the Appendix, covering limitations, salient object detection extension, and possible improvements. Code: https://github.com/Lwt-diamond/RAG-SEG.
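Abstract (reference translation): Camouflaged object detection (COD) is a significant challenge in computer vision because of the high similarity between objects and their backgrounds. Existing approaches often rely on heavy training and large computational resources. Foundation models such as the Segment Anything Model (SAM) generalize well, but they struggle with COD tasks without fine-tuning and need high-quality prompts to perform well. Generating such prompts manually, however, is costly and inefficient. To address these challenges, we propose First RAG, Second SEG (RAG-SEG), a training-free paradigm that decouples COD into two stages: Retrieval-Augmented Generation (RAG), which generates coarse masks as prompts, followed by SAM-based segmentation (SEG) for refinement. RAG-SEG builds a compact retrieval database via unsupervised clustering, enabling fast and effective feature retrieval. During inference, the retrieved features produce pseudo-labels that guide precise mask generation with SAM2. The method removes the need for conventional training while remaining competitive. Extensive experiments on benchmark COD datasets show that RAG-SEG performs on par with or surpasses state-of-the-art methods. Notably, all experiments were run on a personal laptop, which highlights the computational efficiency and practicality of the approach. Further analysis in the Appendix covers limitations, an extension to salient object detection, and possible improvements. Code: https://github.com/Lwt-diamond/RAG-SEG.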
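
Illustrative sketch (not the authors' code): the abstract does not say which feature extractor, clustering algorithm, or pseudo-labeling rule RAG-SEG uses. The NumPy example below only shows the general shape of the first stage under assumed choices: plain k-means to compress labeled patch features into a small database, nearest-centroid lookup at inference, and a per-cluster foreground ratio thresholded into a coarse mask. The function names build_database and retrieve_coarse_mask, the threshold, and the synthetic features are all hypothetical. The second stage, in which the coarse mask would be handed to SAM2 as a prompt, is omitted.

```python
import numpy as np


def _nearest_centroid(features, centroids):
    """Index of the closest centroid (squared Euclidean) for each feature row."""
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def build_database(features, fg_labels, k=8, iters=20, seed=0):
    """Compress labeled patch features into k centroids (assumed: plain k-means).

    features:  (N, D) array of patch features from any frozen encoder.
    fg_labels: (N,) array of 0/1 values, 1 where the patch lies on the object.
    Returns the centroids and, per cluster, the fraction of foreground patches.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = _nearest_centroid(features, centroids)
        for c in range(k):
            members = features[assign == c]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[c] = members.mean(axis=0)
    assign = _nearest_centroid(features, centroids)
    fg_ratio = np.zeros(k)
    for c in range(k):
        in_cluster = assign == c
        if in_cluster.any():
            fg_ratio[c] = fg_labels[in_cluster].mean()
    return centroids, fg_ratio


def retrieve_coarse_mask(query_features, centroids, fg_ratio, grid_hw, threshold=0.5):
    """Look up each query patch in the database and threshold into a coarse mask.

    query_features: (H*W, D) patch features of the test image, row-major order.
    grid_hw:        (H, W) shape of the patch grid.
    """
    nearest = _nearest_centroid(query_features, centroids)
    scores = fg_ratio[nearest]  # pseudo-label score per patch
    return scores.reshape(grid_hw) > threshold


# Toy usage with synthetic, well-separated "foreground" and "background" features.
rng = np.random.default_rng(1)
fg = 3.0 + 0.1 * rng.standard_normal((20, 4))
bg = -3.0 + 0.1 * rng.standard_normal((20, 4))
db_features = np.vstack([fg, bg])
db_labels = np.concatenate([np.ones(20), np.zeros(20)])

centroids, fg_ratio = build_database(db_features, db_labels, k=2)
query = np.vstack([np.full((8, 4), 3.0), np.full((8, 4), -3.0)])
coarse_mask = retrieve_coarse_mask(query, centroids, fg_ratio, grid_hw=(4, 4))
print(coarse_mask.astype(int))
```

Because the database stores only centroids and one ratio per cluster, lookup cost depends on the number of clusters rather than on the number of labeled patches, which is one plausible reading of the "compact retrieval database" described in the abstract.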

Related papers