Fugu-MT Paper Translation (Abstract): Excite, Attend and Segment (EASe): Domain-Agnostic Fine-Grained Mask Discovery with Feature Calibration and Self-Supervised Upsampling

Paper summary: Excite, Attend and Segment (EASe): Domain-Agnostic Fine-Grained Mask Discovery with Feature Calibration and Self-Supervised Upsampling

arxiv url: http://arxiv.org/abs/2604.00276v1
Date: Tue, 31 Mar 2026 22:03:15 GMT
Status: Translation complete
System update date: 2026-04-02 16:44:31.743127
Title: Excite, Attend and Segment (EASe): Domain-Agnostic Fine-Grained Mask Discovery with Feature Calibration and Self-Supervised Upsampling
Authors: Deepank Singh, Anurag Nihal, Vedhus Hoskere
Abstract summary: Excite, Attend and Segment (EASe) is an unsupervised, domain-agnostic semantic segmentation framework. The evaluation shows that EASe outperforms previous state-of-the-art methods.
Reference score (independently computed attention score): 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Unsupervised segmentation approaches have increasingly leveraged foundation models (FMs) to improve salient object discovery. However, these methods often falter in scenes with complex, multi-component morphologies, where fine-grained structural detail is indispensable. Many state-of-the-art unsupervised segmentation pipelines rely on mask discovery approaches that utilize coarse, patch-level representations. These coarse representations inherently suppress the fine-grained detail required to resolve such complex morphologies. To overcome this limitation, we propose Excite, Attend and Segment (EASe), an unsupervised domain-agnostic semantic segmentation framework for easy fine-grained mask discovery across challenging real-world scenes. EASe utilizes novel Semantic-Aware Upsampling with Channel Excitation (SAUCE) to excite low-resolution FM feature channels for selective calibration and attends across spatially-encoded image and FM features to recover full-resolution semantic representations. Finally, EASe segments the aggregated features into multi-granularity masks using a novel training-free Cue-Attentive Feature Aggregator (CAFE) which leverages SAUCE attention scores as a semantic grouping signal. EASe, together with SAUCE and CAFE, operates directly at pixel-level feature representations to enable accurate fine-grained dense semantic mask discovery. Our evaluation demonstrates superior performance of EASe over previous state-of-the-art (SOTA) methods across major standard benchmarks and diverse datasets with complex morphologies. Code is available at https://ease-project.github.io
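Abstract (reference translation): Unsupervised segmentation approaches increasingly leverage foundation models (FMs) to improve salient object discovery. However, these methods often fail in scenes with complex, multi-component morphologies, where fine-grained structural detail is essential. Many state-of-the-art unsupervised segmentation pipelines rely on mask discovery approaches that use coarse, patch-level representations. These coarse representations inherently suppress the fine detail needed to resolve such complex morphologies. To overcome this limitation, we propose Excite, Attend and Segment (EASe), an unsupervised, domain-agnostic semantic segmentation framework that eases fine-grained mask discovery across real-world scenes. EASe uses a novel Semantic-Aware Upsampling with Channel Excitation (SAUCE) to excite low-resolution FM feature channels for selective calibration and to attend across spatially encoded image and FM features, recovering full-resolution semantic representations. Finally, EASe segments the aggregated features into multi-granularity masks using a novel training-free Cue-Attentive Feature Aggregator (CAFE), which leverages SAUCE attention scores as a semantic grouping signal. EASe, SAUCE, and CAFE operate directly on pixel-level feature representations, enabling accurate discovery of fine-grained semantic masks. The evaluation shows that EASe outperforms previous state-of-the-art (SOTA) methods on standard benchmarks and on diverse datasets with complex morphologies. Code is available at https://ease-project.github.io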


Related papers