Fugu-MT Paper Translation (Abstract): Need We Teach Foundation Models What is a Generative Image? Gradient-Free Generative Artifact Detection via Analytic Spectral Adaptation

Paper abstract: Need We Teach Foundation Models What is a Generative Image? Gradient-Free Generative Artifact Detection via Analytic Spectral Adaptation

arxiv url: http://arxiv.org/abs/2606.07660v1
Date: Wed, 03 Jun 2026 13:04:23 GMT
Status: Translation complete
System update date: 2026-06-09 14:42:05.202713
Title: Need We Teach Foundation Models What is a Generative Image? Gradient-Free Generative Artifact Detection via Analytic Spectral Adaptation
Title (reference translation): Do We Need to Teach Foundation Models What a Generative Image Is? Gradient-Free Generative Artifact Detection via Analytic Spectral Adaptation
Authors: Qiaoyu Chen, Bing Zhang
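Abstract summary: Adapting foundation models to detect generative artifacts via gradient-based updates compromises their intrinsic representations. This paper proposes a gradient-free methodology that reframes detection from binary classification to an out-of-distribution (OOD) anomaly measurement problem.
Reference score (independently computed attention score): 3.881211374324378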
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Adapting foundation models to detect generative artifacts via gradient-based updates compromises their intrinsic representations. Under optimization on limited samples, models overfit to local domain shortcuts. Fine-tuning massive weights on specialized data introduces erroneous inductive biases, inducing a measurable $\mathcal{L}_2$ norm perturbation in the high-dimensional feature space -- a phenomenon we formalize as anchor drift. Amplified by nonlinear activations, this drift impairs zero-shot forgery detection across unseen domains. We propose a gradient-free methodology reframing detection from binary classification to an out-of-distribution (OOD) anomaly measurement problem. Treating a frozen foundation model as a stable coordinate system, we establish an absolute natural anchor on the real visual manifold by analytically decoupling statistical and semantic deviations, derived from attention-weighted spatial moments and orthogonal projection of perceptual inconsistencies. Evaluated in an extreme zero-shot setting (trained on face forgeries, tested on universal Text-to-Image generations), our method significantly outperforms gradient-optimized paradigms. Backpropagation-free forward passes and linear solvers enable hardware-agnostic, edge-deployable calibration with minimal latency. Furthermore, the Sherman-Morrison formula unlocks instantaneous online learning against novel attacks and enables privacy-preserving federated collaboration via covariance delta transmission.
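Abstract (reference translation): Adapting foundation models to detect generative artifacts via gradient-based updates compromises their intrinsic representations. When optimized on limited samples, models overfit to local domain shortcuts. Fine-tuning massive weights on specialized data introduces erroneous inductive biases and induces a measurable $\mathcal{L}_2$ norm perturbation in the high-dimensional feature space, a phenomenon the authors formalize as anchor drift. Amplified by nonlinear activations, this drift impairs zero-shot forgery detection across unseen domains. The authors propose a gradient-free method that reframes detection from binary classification to an out-of-distribution (OOD) anomaly measurement problem. Treating a frozen foundation model as a stable coordinate system, they establish an absolute natural anchor on the real visual manifold by analytically decoupling statistical and semantic deviations, derived from attention-weighted spatial moments and an orthogonal projection of perceptual inconsistencies. Evaluated in an extreme zero-shot setting (trained on face forgeries, tested on universal Text-to-Image generations), the method significantly outperforms gradient-optimized paradigms. Backpropagation-free forward passes and linear solvers enable hardware-agnostic, edge-deployable calibration with minimal latency. In addition, the Sherman-Morrison formula enables instantaneous online learning against novel attacks and privacy-preserving federated collaboration via covariance delta transmission.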
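Note (illustrative sketch, not from the paper): the abstract names the Sherman-Morrison formula as the mechanism for instantaneous online updates but does not say how it is applied. The formula itself is standard: (A + u v^T)^{-1} = A^{-1} - (A^{-1} u v^T A^{-1}) / (1 + v^T A^{-1} u). The Python sketch below assumes, purely for illustration, that the detector keeps the inverse of a regularised scatter matrix of frozen-model features and scores inputs with a Mahalanobis-style quadratic form. The function names, the toy 2-D vectors, and the scoring rule are all hypothetical and should not be read as the authors' implementation.

```python
def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def sherman_morrison_update(a_inv, u, v):
    """Return (A + u v^T)^{-1} given A^{-1}, without re-inverting A."""
    n = len(u)
    a_inv_u = mat_vec(a_inv, u)  # column vector A^{-1} u
    # Row vector v^T A^{-1}
    v_a_inv = [sum(v[i] * a_inv[i][j] for i in range(n)) for j in range(n)]
    denom = 1.0 + sum(vi * wi for vi, wi in zip(v, a_inv_u))
    if abs(denom) < 1e-12:
        raise ValueError("rank-one update would make the matrix singular")
    return [
        [a_inv[i][j] - a_inv_u[i] * v_a_inv[j] / denom for j in range(n)]
        for i in range(n)
    ]


def anomaly_score(a_inv, mean, x):
    """Squared Mahalanobis-style distance (x - mean)^T A^{-1} (x - mean)."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    return sum(di * wi for di, wi in zip(d, mat_vec(a_inv, d)))


# Hypothetical 2-D "features"; a real system would use frozen-model embeddings.
a_inv = [[1.0, 0.0], [0.0, 1.0]]  # inverse of the assumed regulariser A = I
mean = [0.0, 0.0]                 # assumed fixed reference point
sample = [1.0, 2.0]

score_before = anomaly_score(a_inv, mean, sample)
a_inv = sherman_morrison_update(a_inv, sample, sample)  # A <- A + x x^T
score_after = anomaly_score(a_inv, mean, sample)
print(score_before, score_after)
```

Each update costs O(d^2) for feature dimension d instead of the O(d^3) of a fresh inversion, which is consistent with the abstract's claim of low-latency, backpropagation-free calibration. Under the same assumption, a client in a federated setting could transmit only the rank-one terms (or their sum) rather than raw samples.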

Related papers