How can Diffusion Models Evolve into Continual Generators?
- URL: http://arxiv.org/abs/2505.11936v2
- Date: Thu, 05 Jun 2025 18:36:13 GMT
- Title: How can Diffusion Models Evolve into Continual Generators?
- Authors: Jingren Liu, Zhong Ji, Xiangyu Chen
- Abstract summary: Continual Consistency Diffusion (CCD) is a principled framework that integrates consistency objectives into training. CCD achieves state-of-the-art performance under continual settings.
- Score: 22.06922342737842
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While diffusion models have achieved remarkable success in static data generation, their deployment in streaming or continual learning (CL) scenarios faces a major challenge: catastrophic forgetting (CF), where newly acquired generative capabilities overwrite previously learned ones. To systematically address this, we introduce a formal Continual Diffusion Generation (CDG) paradigm that characterizes and redefines CL in the context of generative diffusion models. Prior efforts often adapt heuristic strategies from continual classification tasks but lack alignment with the underlying diffusion process. In this work, we develop the first theoretical framework for CDG by analyzing cross-task dynamics in diffusion-based generative modeling. Our analysis reveals that the retention and stability of generative knowledge across tasks are governed by three key consistency criteria: inter-task knowledge consistency (IKC), unconditional knowledge consistency (UKC), and label knowledge consistency (LKC). Building on these insights, we propose Continual Consistency Diffusion (CCD), a principled framework that integrates these consistency objectives into training via hierarchical loss terms $\mathcal{L}_{IKC}$, $\mathcal{L}_{UKC}$, and $\mathcal{L}_{LKC}$. This promotes effective knowledge retention while enabling the assimilation of new generative capabilities. Extensive experiments on four benchmark datasets demonstrate that CCD achieves state-of-the-art performance under continual settings, with substantial gains in Mean Fidelity (MF) and Incremental Mean Fidelity (IMF), particularly in tasks with rich cross-task knowledge overlap.
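The abstract says CCD integrates three consistency objectives into training through the loss terms $\mathcal{L}_{IKC}$, $\mathcal{L}_{UKC}$, and $\mathcal{L}_{LKC}$, but it does not say how they are combined. The sketch below is only one plausible reading, assuming a plain weighted sum added to a base denoising loss. The names `ConsistencyWeights` and `total_loss`, the weights, and the additive form are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsistencyWeights:
    """Hypothetical weights for the three consistency terms.

    The abstract gives no coefficients; 1.0 is a placeholder default.
    """
    ikc: float = 1.0  # inter-task knowledge consistency
    ukc: float = 1.0  # unconditional knowledge consistency
    lkc: float = 1.0  # label knowledge consistency


def total_loss(denoise, l_ikc, l_ukc, l_lkc, w=ConsistencyWeights()):
    """Assumed form: base denoising loss plus a weighted sum of the three terms."""
    return denoise + w.ikc * l_ikc + w.ukc * l_ukc + w.lkc * l_lkc
```

With every weight set to zero this reduces to ordinary fine-tuning on the new task, which is the setting in which catastrophic forgetting is expected to appear.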
Related papers
- Foundations of Diffusion Models in General State Spaces: A Self-Contained Introduction [54.95522167029998]
This article is a self-contained primer on diffusion over general state spaces.
We develop the discrete-time view (forward noising via Markov kernels and learned reverse dynamics) alongside its continuous-time limits.
A common variational treatment yields the ELBO that underpins standard training losses.
arXiv Detail & Related papers (2025-12-04T18:55:36Z)
- Towards a Golden Classifier-Free Guidance Path via Foresight Fixed Point Iterations [12.366757123129402]
We propose a unified perspective that reframes conditional guidance as fixed point iterations.
We introduce Foresight Guidance (FSG), which prioritizes solving longer-interval subproblems in early diffusion stages.
Our work offers novel perspectives for conditional guidance and unlocks the potential of adaptive design.
arXiv Detail & Related papers (2025-10-24T14:39:07Z)
- Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL [19.094835780362775]
Few-Shot Class-Incremental Learning (FSCIL) challenges models to sequentially learn new classes from minimal examples.
Current FSCIL methods often struggle with generalization due to their reliance on limited datasets.
This paper introduces Diffusion-Classifier Synergy (DCS), a novel framework that establishes a mutual boosting loop between a diffusion model and an FSCIL classifier.
arXiv Detail & Related papers (2025-10-04T01:48:52Z)
- Learning Robust Diffusion Models from Imprecise Supervision [75.53546939251146]
DMIS is a unified framework for training robust Conditional Diffusion Models from Imprecise Supervision.
Our framework is derived from likelihood and decomposes the objective into generative and classification components.
Experiments on diverse forms of imprecise supervision, covering image generation, weakly supervised learning, and dataset condensation, demonstrate that DMIS consistently produces high-quality and class-discriminative samples.
arXiv Detail & Related papers (2025-10-03T14:00:32Z)
- Causal Time Series Generation via Diffusion Models [96.95879410279089]
We introduce causal time series generation as a new TSG task family, formalized within Pearl's causal ladder.
To instantiate these tasks, we develop CaTSG, a unified diffusion-based framework.
Experiments on both synthetic and real-world datasets show that CaTSG achieves superior fidelity.
arXiv Detail & Related papers (2025-09-25T07:34:46Z)
- Causality-aligned Prompt Learning via Diffusion-based Counterfactual Generation [45.395353088233556]
We introduce a theoretically grounded Diffusion-based Counterfactual prompt learning framework.
Our method performs excellently across tasks such as image classification, image-text retrieval, and visual question answering, with particularly strong advantages in unseen categories.
arXiv Detail & Related papers (2025-07-26T09:27:52Z)
- Clustering via Self-Supervised Diffusion [6.9158153233702935]
Clustering via Diffusion (CLUDI) is a self-supervised framework that combines the generative power of diffusion models with pre-trained Vision Transformer features to achieve robust and accurate clustering.
CLUDI is trained via a teacher-student paradigm: the teacher uses diffusion-based sampling to produce diverse cluster assignments, which the student refines into stable predictions.
arXiv Detail & Related papers (2025-07-06T07:57:08Z)
- DDAE++: Enhancing Diffusion Models Towards Unified Generative and Discriminative Learning [53.27049077100897]
Generative pre-training has been shown to yield discriminative representations, paving the way towards unified visual generation and understanding.
This work introduces self-conditioning, a mechanism that internally leverages the rich semantics inherent in the denoising network to guide its own decoding layers.
Results are compelling: our method boosts both generation FID and recognition accuracy with 1% computational overhead and generalizes across diverse diffusion architectures.
arXiv Detail & Related papers (2025-05-16T08:47:16Z)
- Channel Fingerprint Construction for Massive MIMO: A Deep Conditional Generative Approach [65.47969413708344]
We introduce the concept of CF twins and design a conditional generative diffusion model (CGDM).
We employ a variational inference technique to derive the evidence lower bound (ELBO) for the log-marginal distribution of the observed fine-grained CF conditioned on the coarse-grained CF.
We show that the proposed approach exhibits significant improvement in reconstruction performance compared to the baselines.
arXiv Detail & Related papers (2025-05-12T01:36:06Z)
- Constrained Discrete Diffusion [61.81569616239755]
This paper introduces Constrained Discrete Diffusion (CDD), a novel integration of differentiable constraint optimization within the diffusion process.
CDD directly imposes constraints into the discrete diffusion sampling process, resulting in a training-free and effective approach.
arXiv Detail & Related papers (2025-03-12T19:48:12Z)
- Freeze and Cluster: A Simple Baseline for Rehearsal-Free Continual Category Discovery [13.68907640197364]
This paper addresses the problem of Rehearsal-Free Continual Category Discovery (RF-CCD).
RF-CCD focuses on continuously identifying novel classes by leveraging knowledge from labeled data.
Previous approaches have struggled to effectively integrate advanced techniques from both domains.
arXiv Detail & Related papers (2025-03-12T06:46:32Z)
- Recurrent Knowledge Identification and Fusion for Language Model Continual Learning [41.901501650712234]
Recurrent-KIF is a CL framework for Recurrent Knowledge Identification and Fusion.
Inspired by human continual learning, Recurrent-KIF employs an inner loop that rapidly adapts to new tasks and an outer loop that globally manages the fusion of new and historical knowledge.
arXiv Detail & Related papers (2025-02-22T05:37:27Z)
- Continual Learning Should Move Beyond Incremental Classification [51.23416308775444]
Continual learning (CL) is the sub-field of machine learning concerned with accumulating knowledge in dynamic environments.
Here, we argue that maintaining such a focus limits both theoretical development and practical applicability of CL methods.
We identify three fundamental challenges: (C1) the nature of continuity in learning problems, (C2) the choice of appropriate spaces and metrics for measuring similarity, and (C3) the role of learning objectives beyond classification.
arXiv Detail & Related papers (2025-02-17T15:40:13Z)
- Parallelly Tempered Generative Adversarial Networks [7.94957965474334]
A generative adversarial network (GAN) has been a representative backbone model in generative artificial intelligence (AI).
This work analyzes the training instability and inefficiency in the presence of mode collapse by linking it to multimodality in the target distribution.
With our newly developed GAN objective function, the generator can learn all the tempered distributions simultaneously.
arXiv Detail & Related papers (2024-11-18T18:01:13Z)
- Continual Task Learning through Adaptive Policy Self-Composition [54.95680427960524]
CompoFormer is a structure-based continual transformer model that adaptively composes previous policies via a meta-policy network.
Our experiments reveal that CompoFormer outperforms conventional continual learning (CL) methods, particularly in longer task sequences.
arXiv Detail & Related papers (2024-11-18T08:20:21Z)
- Improving Vector-Quantized Image Modeling with Latent Consistency-Matching Diffusion [55.185588994883226]
We introduce VQ-LCMD, a continuous-space latent diffusion framework within the embedding space that stabilizes training.
VQ-LCMD uses a novel training objective combining the joint embedding-diffusion variational lower bound with a consistency-matching (CM) loss.
Experiments show that the proposed VQ-LCMD yields superior results on FFHQ, LSUN Churches, and LSUN Bedrooms compared to discrete-state latent diffusion models.
arXiv Detail & Related papers (2024-10-18T09:12:33Z)
- LoRanPAC: Low-rank Random Features and Pre-trained Models for Bridging Theory and Practice in Continual Learning [103.45785408116146]
Continual learning (CL) aims to train a model that can solve multiple tasks presented sequentially.
Recent CL approaches have achieved strong performance by leveraging large pre-trained models that generalize well to downstream tasks.
However, such methods lack theoretical guarantees, making them prone to unexpected failures.
We aim to bridge this gap by designing a simple CL method that is theoretically sound and highly performant.
arXiv Detail & Related papers (2024-10-01T12:58:37Z)
- Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective [125.00228936051657]
We introduce NTK-CL, a novel framework that eliminates task-specific parameter storage while adaptively generating task-relevant features.
By fine-tuning optimizable parameters with appropriate regularization, NTK-CL achieves state-of-the-art performance on established PEFT-CL benchmarks.
arXiv Detail & Related papers (2024-07-24T09:30:04Z)
- Continual Learning with Dirichlet Generative-based Rehearsal [22.314195832409755]
We present Dirichlet Continual Learning, a novel generative-based rehearsal strategy for task-oriented dialogue systems.
We also introduce Jensen-Shannon Knowledge Distillation (JSKD), a robust logit-based knowledge distillation method.
Our experiments confirm the efficacy of our approach in both intent detection and slot-filling tasks, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2023-09-13T12:30:03Z)
- Task-agnostic Continual Learning with Hybrid Probabilistic Models [75.01205414507243]
We propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification.
The flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting.
We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST.
arXiv Detail & Related papers (2021-06-24T05:19:26Z)