Fugu-MT Paper Translation (Abstract): Avoid Catastrophic Forgetting with Rank-1 Fisher from Diffusion Models

Paper summary: Avoid Catastrophic Forgetting with Rank-1 Fisher from Diffusion Models

arxiv url: http://arxiv.org/abs/2509.23593v1
Date: Sun, 28 Sep 2025 02:51:16 GMT
Status: Translation complete
Last updated in system: 2025-09-30 22:32:19.311364
Title: Avoid Catastrophic Forgetting with Rank-1 Fisher from Diffusion Models
Title (reference translation): Avoiding Catastrophic Forgetting with a Rank-1 Fisher from Diffusion Models
Authors: Zekun Wang, Anant Gupta, Zihan Dong, Christopher J. MacLellan
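Abstract summary: Catastrophic forgetting is a central obstacle for continual learning in neural models. We study the gradient geometry of diffusion models, which can already produce high-quality replay data. A rank-1 variant of EWC is as cheap as the diagonal approximation yet captures the dominant curvature direction.
Reference score (independently computed attention metric): 5.0834716824529105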
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Catastrophic forgetting remains a central obstacle for continual learning in neural models. Popular approaches -- replay and elastic weight consolidation (EWC) -- have limitations: replay requires a strong generator and is prone to distributional drift, while EWC implicitly assumes a shared optimum across tasks and typically uses a diagonal Fisher approximation. In this work, we study the gradient geometry of diffusion models, which can already produce high-quality replay data. We provide theoretical and empirical evidence that, in the low signal-to-noise ratio (SNR) regime, per-sample gradients become strongly collinear, yielding an empirical Fisher that is effectively rank-1 and aligned with the mean gradient. Leveraging this structure, we propose a rank-1 variant of EWC that is as cheap as the diagonal approximation yet captures the dominant curvature direction. We pair this penalty with a replay-based approach to encourage parameter sharing across tasks while mitigating drift. On class-incremental image generation datasets (MNIST, FashionMNIST, CIFAR-10, ImageNet-1k), our method consistently improves average FID and reduces forgetting relative to replay-only and diagonal-EWC baselines. In particular, forgetting is nearly eliminated on MNIST and FashionMNIST and is roughly halved on ImageNet-1k. These results suggest that diffusion models admit an approximately rank-1 Fisher. With a better Fisher estimate, EWC becomes a strong complement to replay: replay encourages parameter sharing across tasks, while EWC effectively constrains replay-induced drift.
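Abstract (reference translation): Catastrophic forgetting remains a central obstacle for continual learning in neural models. Replay requires a strong generator and is prone to distributional drift, while EWC implicitly assumes a shared optimum across tasks and typically uses a diagonal Fisher approximation. In this work, we study the gradient geometry of diffusion models, which can already produce high-quality replay data. We provide theoretical and empirical evidence that, in the low signal-to-noise ratio (SNR) regime, per-sample gradients become strongly collinear, yielding an empirical Fisher that is effectively rank-1 and aligned with the mean gradient. Leveraging this structure, we propose a rank-1 variant of EWC that is as cheap as the diagonal approximation yet captures the dominant curvature direction. We pair this penalty with a replay-based approach to encourage parameter sharing across tasks while mitigating drift. On class-incremental image generation datasets (MNIST, FashionMNIST, CIFAR-10, ImageNet-1k), the method consistently improves average FID and reduces forgetting relative to replay-only and diagonal-EWC baselines. In particular, forgetting is nearly eliminated on MNIST and FashionMNIST and is roughly halved on ImageNet-1k. These results suggest that diffusion models admit an approximately rank-1 Fisher. Replay encourages parameter sharing across tasks, while EWC effectively constrains replay-induced drift.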
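The abstract does not spell out the penalty, so the following is only a minimal sketch of the idea it describes, not the authors' implementation. It assumes the rank-1 Fisher is the outer product of the mean per-sample gradient, so the penalty on a parameter shift delta reduces to (lam/2) * (g_bar . delta)^2. The synthetic gradients, the function names, and the parameters `lam`, `scale_std`, and `noise_std` are all hypothetical and chosen for illustration.

```python
import numpy as np


def make_collinear_grads(n=256, d=50, scale_std=0.05, noise_std=0.01, seed=0):
    """Synthetic per-sample gradients g_i = c_i * u + noise (hypothetical data)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)                        # shared unit direction
    c = 1.0 + scale_std * rng.normal(size=n)      # per-sample scale close to 1
    noise = noise_std * rng.normal(size=(n, d))   # small off-direction noise
    return c[:, None] * u[None, :] + noise, u


def top_eigenvalue_share(grads):
    """Fraction of the empirical Fisher's trace carried by its top eigenvalue."""
    # Eigenvalues of (1/n) G^T G are the squared singular values of G over n.
    s = np.linalg.svd(grads, compute_uv=False)
    eig = s ** 2 / grads.shape[0]
    return float(eig[0] / eig.sum())


def full_fisher_penalty(delta, grads, lam=1.0):
    """(lam/2) * delta^T F delta with F = mean_i g_i g_i^T, without forming F."""
    return 0.5 * lam * float(np.mean((grads @ delta) ** 2))


def diagonal_ewc_penalty(delta, grads, lam=1.0):
    """Diagonal EWC: keeps only F_jj = mean_i g_ij^2 and drops cross terms."""
    diag = np.mean(grads ** 2, axis=0)
    return 0.5 * lam * float(np.sum(diag * delta ** 2))


def rank1_ewc_penalty(delta, grads, lam=1.0):
    """Rank-1 sketch: F is approximated by g_bar g_bar^T (assumption)."""
    g_bar = grads.mean(axis=0)                    # one d-vector, like the diagonal
    return 0.5 * lam * float(g_bar @ delta) ** 2


if __name__ == "__main__":
    grads, u = make_collinear_grads()
    rng = np.random.default_rng(1)
    delta = 0.1 * u + 0.02 * rng.normal(size=u.shape[0])  # shift from old optimum
    print("top eigenvalue share:", top_eigenvalue_share(grads))
    print("full Fisher penalty :", full_fisher_penalty(delta, grads))
    print("rank-1 penalty      :", rank1_ewc_penalty(delta, grads))
    print("diagonal penalty    :", diagonal_ewc_penalty(delta, grads))
```

Both approximations store a single vector of the same size as the parameters, which is the sense in which the rank-1 form is "as cheap as" the diagonal one. When the gradients are nearly collinear, as in this toy data, the rank-1 penalty tracks the full quadratic form closely, while the diagonal form discards the cross terms between parameters.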

Related papers