Fugu-MT paper translation (abstract): Generative Model Inversion Through the Lens of the Manifold Hypothesis

Paper overview: Generative Model Inversion Through the Lens of the Manifold Hypothesis

arxiv url: http://arxiv.org/abs/2509.20177v1
Date: Wed, 24 Sep 2025 14:39:25 GMT
Status: translation complete
Last updated in system: 2025-09-25 20:53:19.853343
Title: Generative Model Inversion Through the Lens of the Manifold Hypothesis
Title (reference translation): Generative Model Inversion Through the Lens of the Manifold Hypothesis
Authors: Xiong Peng, Bo Han, Fengfei Yu, Tongliang Liu, Feng Liu, Mingyuan Zhou
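Abstract summary: Model inversion attacks (MIAs) aim to reconstruct class-representative samples from trained models. Recent generative MIAs use generative adversarial networks to learn image priors that guide the inversion process.
Reference score (independently computed attention score): 98.37040155914595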
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Model inversion attacks (MIAs) aim to reconstruct class-representative samples from trained models. Recent generative MIAs utilize generative adversarial networks to learn image priors that guide the inversion process, yielding reconstructions with high visual quality and strong fidelity to the private training data. To explore the reason behind their effectiveness, we begin by examining the gradients of inversion loss with respect to synthetic inputs, and find that these gradients are surprisingly noisy. Further analysis reveals that generative inversion implicitly denoises these gradients by projecting them onto the tangent space of the generator manifold, filtering out off-manifold components while preserving informative directions aligned with the manifold. Our empirical measurements show that, in models trained with standard supervision, loss gradients often exhibit large angular deviations from the data manifold, indicating poor alignment with class-relevant directions. This observation motivates our central hypothesis: models become more vulnerable to MIAs when their loss gradients align more closely with the generator manifold. We validate this hypothesis by designing a novel training objective that explicitly promotes such alignment. Building on this insight, we further introduce a training-free approach to enhance gradient-manifold alignment during inversion, leading to consistent improvements over state-of-the-art generative MIAs.
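Abstract (reference translation): Model inversion attacks (MIAs) aim to reconstruct class-representative samples from trained models. Recent generative MIAs use generative adversarial networks to learn image priors that guide the inversion process, producing reconstructions with high visual quality and strong fidelity to the private training data. To understand why they are effective, we first examine the gradients of the inversion loss with respect to synthetic inputs and find them surprisingly noisy. Further analysis shows that generative inversion implicitly denoises these gradients by projecting them onto the tangent space of the generator manifold, filtering out off-manifold components while keeping the informative directions aligned with the manifold. Empirically, in models trained with standard supervision, loss gradients often deviate from the data manifold by large angles, indicating poor alignment with class-relevant directions. This motivates the central hypothesis: models become more vulnerable to MIAs when their loss gradients align more closely with the generator manifold. We validate the hypothesis by designing a new training objective that explicitly promotes such alignment. Building on this insight, we also introduce a training-free approach that improves gradient-manifold alignment during inversion, yielding consistent improvements over state-of-the-art generative MIAs.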
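
Illustrative note: the abstract describes projecting loss gradients onto the tangent space of the generator manifold and measuring their angular deviation from it. The sketch below shows that geometry on a toy problem only; it is not the authors' implementation. The generator, its Jacobian, and the gradient vector are all hypothetical, and the tangent space is taken to be the column space of the generator Jacobian at a latent point.

```python
import numpy as np

def generator(z):
    """Hypothetical toy generator: maps a 2-D latent to a 3-D 'image'."""
    return np.array([z[0], z[1], z[0] ** 2 + z[1] ** 2])

def jacobian(z):
    """Analytic Jacobian dG/dz of the toy generator (shape 3 x 2).

    Its columns span the tangent space of the generator manifold at G(z).
    """
    return np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [2.0 * z[0], 2.0 * z[1]]])

def project_to_tangent(grad_x, J):
    """Orthogonally project an input-space gradient onto span(J)."""
    # Least squares finds coefficients c minimizing ||J c - grad_x||,
    # so J @ c is the orthogonal projection of grad_x onto span(J).
    coeffs, *_ = np.linalg.lstsq(J, grad_x, rcond=None)
    return J @ coeffs

def angle_deg(u, v):
    """Angle between two vectors, in degrees."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

z = np.array([0.5, -0.25])           # assumed latent point
J = jacobian(z)
grad_x = np.array([1.0, 2.0, 3.0])   # made-up loss gradient w.r.t. the input

grad_on = project_to_tangent(grad_x, J)   # on-manifold (tangent) component
grad_off = grad_x - grad_on               # off-manifold component
deviation = angle_deg(grad_x, grad_on)    # angular deviation from the tangent space

# The latent-space gradient from the chain rule only sees the tangent component.
latent_grad = J.T @ grad_x
```

In this toy setting, `J.T @ grad_x` equals `J.T @ grad_on`, so optimizing in latent space never sees the off-manifold component, which is one way to read the "implicit denoising" described in the abstract. A larger `deviation` means more of the gradient lies off the manifold.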

Related papers