Fugu-MT paper translation (abstract): Reducing Diffusion Model Memorization with Higher Order Langevin Dynamics

Paper overview: Reducing Diffusion Model Memorization with Higher Order Langevin Dynamics

arxiv url: http://arxiv.org/abs/2605.19170v2
Date: Sat, 23 May 2026 19:16:25 GMT
Status: translation complete
Last updated in system: 2026-05-26 16:32:37.743824
Title: Reducing Diffusion Model Memorization with Higher Order Langevin Dynamics
Title (reference translation): Reducing Memorization in Diffusion Models with Higher-Order Langevin Dynamics
Authors: Benjamin Sterling, Mónica F. Bugallo, Tom Tirer
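Abstract summary: Diffusion/score-based models can generate high-quality samples that mimic the training data distribution. They have been observed to be prone to reproducing training samples, a behavior known as "memorization" that can potentially violate copyright and privacy. The paper studies the effect of Higher-Order Langevin Dynamics (HOLD) on this phenomenon.
Reference score (attention measure computed independently by the site): 22.777011957535255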
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion/score-based models have emerged as powerful generative models, capable of generating high-quality samples that mimic the training data distribution. However, it has been observed that they are prone to reproducing training samples, a phenomenon known as "memorization," potentially violating copyright and privacy. In this paper, we study the effect of Higher-Order Langevin Dynamics (HOLD) on this phenomenon. HOLD diffusion processes introduce auxiliary variables; if the data variable is interpreted as "position," then the auxiliary variables can be interpreted as "velocity" and "acceleration," depending on the chosen order of the model. They were originally proposed based on the intuition that they regularize the trajectories of the data variable by implicitly imposing additional dynamical constraints. Our work provides, to our knowledge, the first theoretical characterization of the regularization effect of HOLD. Specifically, we show that in HOLD, the dynamics of the data variable are governed by a low-pass-filtered version of the learned score function, with smoothness increasing with the order of HOLD. We then analyze the optimal empirical score and the possibility of distribution collapse. Together, our results explain the mitigation of memorization as the model order increases. Finally, we present an empirical study on real-world data that supports our theory and highlights this distinct advantage of HOLD over standard diffusion in practice.
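Abstract (reference translation): Diffusion/score-based models have emerged as powerful generative models that can produce high-quality samples mimicking the training data distribution. However, they have been observed to tend to reproduce training samples, which is known as "memorization" and can potentially violate copyright and privacy. This paper examines the effect of Higher-Order Langevin Dynamics (HOLD) on this phenomenon. HOLD diffusion processes introduce auxiliary variables: if the data variable is interpreted as "position," the auxiliary variables can be interpreted as "velocity" and "acceleration," depending on the chosen order of the model. They were proposed on the intuition that they regularize the trajectories of the data variable by implicitly imposing dynamical constraints. To the authors' knowledge, this work provides the first theoretical characterization of the regularization effect of HOLD. Specifically, it shows that in HOLD the dynamics of the data variable are governed by a low-pass-filtered version of the learned score function, with smoothness increasing with the order of HOLD. It then analyzes the optimal empirical score and the possibility of distribution collapse. Together, these results explain why memorization is mitigated as the model order increases. Finally, an empirical study on real-world data supports the theory and highlights this distinct advantage of HOLD over standard diffusion in practice.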
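
The abstract's statement that smoothness increases with the filter order can be illustrated generically. The sketch below is not the paper's HOLD construction or its filter. It assumes a chain of identical first-order exponential smoothers as a stand-in for a low-pass filter of a given order, and white Gaussian noise as a stand-in for a rough driving signal. All function names and parameter values (`low_pass`, `roughness`, `roughness_by_order`, `alpha=0.2`) are hypothetical and chosen only for illustration.

```python
import random


def low_pass(signal, alpha):
    """First-order exponential smoother: y[k] = (1 - alpha) * y[k-1] + alpha * x[k]."""
    out, y = [], 0.0
    for x in signal:
        y = (1.0 - alpha) * y + alpha * x
        out.append(y)
    return out


def roughness(signal):
    """Sum of squared increments divided by the sum of squared deviations
    from the mean. Lower values indicate a smoother signal."""
    mean = sum(signal) / len(signal)
    spread = sum((v - mean) ** 2 for v in signal)
    jumps = sum((b - a) ** 2 for a, b in zip(signal, signal[1:]))
    return jumps / spread


def roughness_by_order(max_order, n=5000, alpha=0.2, seed=0):
    """Apply the smoother 0, 1, ..., max_order times to the same white-noise
    input and return the roughness measured after each stage."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 1.0) for _ in range(n)]  # assumed rough input
    results = [roughness(signal)]  # order 0: unfiltered
    for _ in range(max_order):
        signal = low_pass(signal, alpha)  # one more filtering stage
        results.append(roughness(signal))
    return results


if __name__ == "__main__":
    for order, value in enumerate(roughness_by_order(3)):
        print(f"order {order}: roughness {value:.4f}")
```

In this toy setting each additional filtering stage lowers the normalized roughness, which matches the qualitative claim about order and smoothness; it says nothing about the specific filter derived in the paper.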

Related papers