Fugu-MT Paper Translation (Abstract): Hard to See, Hard to Label: Generative and Symbolic Acquisition for Subtle Visual Phenomena

Paper overview: Hard to See, Hard to Label: Generative and Symbolic Acquisition for Subtle Visual Phenomena

arxiv url: http://arxiv.org/abs/2604.22990v2
Date: Tue, 28 Apr 2026 02:24:23 GMT
Status: Translation complete
System last updated: 2026-04-29 14:06:43.825778
Title: Hard to See, Hard to Label: Generative and Symbolic Acquisition for Subtle Visual Phenomena
Authors: Renjith Prasad, Rishabh Sharma, Andrew E. Shao, Annmary Justine Koomthanam, Shreyas Kulkarni, Suparna Bhattacharya, Martin Foltin, Amit Sheth, David Orozco, Matthew Quinn, Brian Sammuli
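Abstract summary: This paper proposes an active learning framework for object detection that combines a diffusion-based difficulty signal with a hierarchical semantic coverage prior. By balancing visual difficulty with semantic coverage, GSAL improves retrieval of subtle and rare targets that are often missed by uncertainty-only selection.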
Reference score (independently computed attention score): 3.9275663040435
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Subtle visual anomalies such as hairline cracks, sub-millimeter voids, and low-contrast inclusions are structurally atypical yet visually ambiguous, making them both difficult to annotate and easy to overlook during active learning. Standard acquisition heuristics based on discriminative uncertainty or feature diversity often overselect dominant patterns while underexploring sparse yet important regions of the data space. This failure mode is especially severe in industrial defect inspection, where anomalies may be both low-prevalence and difficult to distinguish from surrounding structure. To address this, we propose GSAL, an active learning framework for object detection that combines a diffusion-based difficulty signal with a hierarchical semantic coverage prior. The diffusion component scores images and proposals using reconstruction discrepancy and denoising variability, prioritizing visually atypical or ambiguous examples. However, diffusion alone does not prevent acquisition from repeatedly favoring hard samples within dominant semantic modes. The semantic component therefore organizes candidate samples in a three-level concept graph and promotes coverage of underrepresented semantic regions while providing interpretable acquisition rationales. By balancing visual difficulty with semantic coverage, GSAL improves retrieval of subtle and rare targets that are often missed by uncertainty-only selection. Experiments on a proprietary thin-film defect dataset, Pascal VOC, and MS COCO show consistent gains in label efficiency and rare-class retrieval over uncertainty-, diversity-, and hybrid-based baselines.
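
The balance between visual difficulty and semantic coverage described in the abstract can be illustrated with a small sketch. This is not the authors' GSAL implementation: the abstract gives no formulas, so the greedy rule, the `1 / (1 + count)` coverage bonus, the `coverage_weight` parameter, and the function name `select_batch` are all assumptions made for illustration. The difficulty values are assumed to be precomputed (for example from reconstruction discrepancy and denoising variability), and each candidate carries a hypothetical three-level concept path.

```python
from collections import Counter


def select_batch(candidates, budget, coverage_weight=1.0):
    """Greedily pick up to `budget` samples by difficulty plus a coverage bonus.

    candidates: list of (sample_id, difficulty, concept_path), where
    concept_path is a tuple naming one node per level of a concept hierarchy.
    All scoring choices here are illustrative assumptions, not the paper's.
    """
    counts = Counter()  # selections so far per concept node (keyed by path prefix)
    remaining = list(candidates)
    selected = []

    def score(candidate):
        _, difficulty, path = candidate
        # Each level contributes a bonus that shrinks as that node gets covered.
        bonus = sum(1.0 / (1 + counts[path[:k]]) for k in range(1, len(path) + 1))
        return difficulty + coverage_weight * bonus

    while remaining and len(selected) < budget:
        best = max(remaining, key=score)
        remaining.remove(best)
        selected.append(best[0])
        # Record coverage at every level of the chosen sample's path.
        for k in range(1, len(best[2]) + 1):
            counts[best[2][:k]] += 1
    return selected


# Hypothetical pool: two hard samples share one dominant concept path.
pool = [
    ("a", 0.90, ("defect", "crack", "hairline")),
    ("b", 0.85, ("defect", "crack", "hairline")),
    ("c", 0.50, ("defect", "void", "sub_mm")),
    ("d", 0.20, ("texture", "grain", "coarse")),
]
print(select_batch(pool, budget=3))                       # coverage-aware order
print(select_batch(pool, budget=3, coverage_weight=0.0))  # difficulty-only order
```

With `coverage_weight=0.0` the rule reduces to picking the hardest samples, so both samples from the dominant "hairline" path are chosen first. With the bonus enabled, the second pick moves to an uncovered branch of the hierarchy, which mirrors the failure mode the abstract attributes to difficulty-only selection.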

Related papers