Beyond Synthetic Replays: Turning Diffusion Features into Few-Shot Class-Incremental Learning Knowledge
- URL: http://arxiv.org/abs/2503.23402v2
- Date: Sat, 27 Sep 2025 10:31:59 GMT
- Title: Beyond Synthetic Replays: Turning Diffusion Features into Few-Shot Class-Incremental Learning Knowledge
- Authors: Junsu Kim, Yunhoe Ku, Dongyoon Han, Seungryul Baek
- Abstract summary: Few-shot class-incremental learning (FSCIL) is challenging due to extremely limited training data. Recent works have explored generative models, particularly Stable Diffusion (SD), to address these challenges. We introduce Diffusion-FSCIL, which extracts four synergistic feature types from SD by capturing real image characteristics.
- Score: 36.22704733553466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot class-incremental learning (FSCIL) is challenging due to extremely limited training data while requiring models to acquire new knowledge without catastrophic forgetting. Recent works have explored generative models, particularly Stable Diffusion (SD), to address these challenges. However, existing approaches use SD mainly as a replay generator, whereas we demonstrate that SD's rich multi-scale representations can serve as a unified backbone. Motivated by this observation, we introduce Diffusion-FSCIL, which extracts four synergistic feature types from SD by capturing real image characteristics through inversion, providing semantic diversity via class-conditioned synthesis, enhancing generalization through controlled noise injection, and enabling replay without image storage through generative features. Unlike conventional approaches requiring synthetic buffers and separate classification backbones, our unified framework operates entirely in the latent space with only lightweight networks ($\approx$6M parameters). Extensive experiments on CUB-200, miniImageNet, and CIFAR-100 demonstrate state-of-the-art performance, with comprehensive ablations confirming the necessity of each feature type. Furthermore, we confirm that our streamlined variant maintains competitive accuracy while substantially improving efficiency, establishing the viability of generative models as practical and effective backbones for FSCIL.
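The core idea of the abstract, using a frozen generative backbone as a feature extractor and training only a lightweight head in latent space, can be sketched as follows. This is an illustrative toy, not the authors' implementation: `frozen_sd_features` stands in for the real multi-scale Stable Diffusion U-Net activations (here simulated with fixed random projections), and a nearest-class-mean head stands in for the paper's lightweight networks. The prototype-only head also illustrates replay without image storage, since nothing but per-class statistics persists across sessions.

```python
import numpy as np

def frozen_sd_features(images, scales=(64, 32, 16)):
    """Stand-in for the frozen Stable Diffusion backbone: each 'scale'
    yields a pooled feature vector via a fixed random projection.
    (Illustrative only -- the paper uses real multi-scale U-Net features.)"""
    feats = []
    for i, s in enumerate(scales):
        proj = np.random.default_rng(i).standard_normal((images.shape[1], s))
        feats.append(images @ proj)
    return np.concatenate(feats, axis=1)

class IncrementalNCM:
    """Lightweight nearest-class-mean head: only per-class prototypes are
    stored across sessions, so no image buffer is needed for replay."""
    def __init__(self):
        self.protos = {}
    def fit_session(self, feats, labels):
        for c in np.unique(labels):
            self.protos[int(c)] = feats[labels == c].mean(axis=0)
    def predict(self, feats):
        classes = sorted(self.protos)
        dists = np.stack([np.linalg.norm(feats - self.protos[c], axis=1)
                          for c in classes], axis=1)
        return np.array(classes)[dists.argmin(axis=1)]

# Toy FSCIL run: a base session with many samples, then a 5-shot session.
rng = np.random.default_rng(42)
def make_class(c, n):
    center = np.zeros(8)
    center[c % 8] = 10.0
    return center + 0.1 * rng.standard_normal((n, 8))

X_base = np.concatenate([make_class(c, 50) for c in range(3)])
y_base = np.repeat(np.arange(3), 50)
X_new = np.concatenate([make_class(c, 5) for c in range(3, 5)])
y_new = np.repeat(np.arange(3, 5), 5)

head = IncrementalNCM()
head.fit_session(frozen_sd_features(X_base), y_base)
head.fit_session(frozen_sd_features(X_new), y_new)  # few-shot increment
```

Because the backbone is frozen, adding the 5-shot classes cannot overwrite base-session knowledge; the old prototypes are untouched.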
Related papers
- DAV-GSWT: Diffusion-Active-View Sampling for Data-Efficient Gaussian Splatting Wang Tiles [9.641815204004823]
3D Gaussian Splatting has redefined the capabilities of photorealistic neural rendering. DAV-GSWT is a framework that leverages diffusion priors and active view sampling to synthesize high-fidelity Wang Tiles. Our system significantly reduces the required data volume while maintaining the visual integrity and interactive performance necessary for large-scale virtual environments.
arXiv Detail & Related papers (2026-02-17T04:47:39Z) - Adapting Multimodal Foundation Models for Few-Shot Learning: A Comprehensive Study on Contrastive Captioners [1.2461503242570642]
This paper presents a study on adapting the Contrastive Captioners (CoCa) visual backbone for few-shot image classification. We identify an "augmentation divergence": while strong data augmentation degrades the performance of linear probing in low-shot settings, it is essential for stabilizing LoRA fine-tuning. We also demonstrate that hybrid objectives incorporating Supervised Contrastive (SupCon) loss yield consistent performance improvements over standard Cross-Entropy.
arXiv Detail & Related papers (2025-12-14T20:13:21Z) - Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL [19.094835780362775]
Few-Shot Class-Incremental Learning (FSCIL) challenges models to sequentially learn new classes from minimal examples. Current FSCIL methods often struggle with generalization due to their reliance on limited datasets. This paper introduces Diffusion-Classifier Synergy (DCS), a novel framework that establishes a mutual boosting loop between a diffusion model and an FSCIL classifier.
arXiv Detail & Related papers (2025-10-04T01:48:52Z) - Learning Robust Diffusion Models from Imprecise Supervision [75.53546939251146]
DMIS is a unified framework for training robust Conditional Diffusion Models from Imprecise Supervision. Our framework is derived from likelihood and decomposes the objective into generative and classification components. Experiments on diverse forms of imprecise supervision, covering image generation, weakly supervised learning, and dataset condensation, demonstrate that DMIS consistently produces high-quality and class-discriminative samples.
arXiv Detail & Related papers (2025-10-03T14:00:32Z) - Can Synthetic Images Conquer Forgetting? Beyond Unexplored Doubts in Few-Shot Class-Incremental Learning [9.73590544210575]
Few-shot class-incremental learning (FSCIL) is challenging due to extremely limited training data. We propose Diffusion-FSCIL, a novel approach that employs a text-to-image diffusion model as a frozen backbone.
arXiv Detail & Related papers (2025-07-18T08:38:07Z) - FreSca: Scaling in Frequency Space Enhances Diffusion Models [55.75504192166779]
This paper explores frequency-based control within latent diffusion models. We introduce FreSca, a novel framework that decomposes the noise difference into low- and high-frequency components. FreSca operates without any model retraining or architectural change, offering model- and task-agnostic control.
arXiv Detail & Related papers (2025-04-02T22:03:11Z) - Masked Autoencoders Are Effective Tokenizers for Diffusion Models [56.08109308294133]
MAETok is an autoencoder that learns a semantically rich latent space while maintaining reconstruction fidelity. MAETok achieves significant practical improvements, enabling a gFID of 1.69 with 76x faster training and 31x higher inference throughput for 512x512 generation.
arXiv Detail & Related papers (2025-02-05T18:42:04Z) - Exploring Representation-Aligned Latent Space for Better Generation [86.45670422239317]
We introduce ReaLS, which integrates semantic priors to improve generation performance. We show that fundamental DiT and SiT trained on ReaLS can achieve a 15% improvement in the FID metric. The enhanced semantic latent space also enables perceptual downstream tasks, such as segmentation and depth estimation.
arXiv Detail & Related papers (2025-02-01T07:42:12Z) - LayerMix: Enhanced Data Augmentation through Fractal Integration for Robust Deep Learning [1.786053901581251]
Deep learning models often struggle to maintain consistent performance when confronted with Out-of-Distribution (OOD) samples. We introduce LayerMix, an innovative data augmentation approach that systematically enhances model robustness. Our method generates semantically consistent synthetic samples that significantly improve neural network generalization capabilities.
arXiv Detail & Related papers (2025-01-08T22:22:44Z) - Diffusion Model Meets Non-Exemplar Class-Incremental Learning and Beyond [48.51784137032964]
Non-exemplar class-incremental learning (NECIL) aims to resist catastrophic forgetting without saving old-class samples.
We propose a simple yet effective Diffusion-based Feature Replay (DiffFR) method for NECIL.
arXiv Detail & Related papers (2024-08-06T06:33:24Z) - Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors [56.82596340418697]
We propose a simple yet effective framework comprising a pre-trained Stable Diffusion (SD) model containing rich generative priors, a unified head (U-head) capable of integrating hierarchical representations, and an adapted expert providing discriminative priors.
Comprehensive investigations unveil potential characteristics of Vermouth, such as varying granularity of perception concealed in latent variables at distinct time steps and various U-net stages.
The promising results demonstrate the potential of diffusion models as formidable learners, establishing their significance in furnishing informative and robust visual representations.
arXiv Detail & Related papers (2024-01-29T10:36:57Z) - Learning Prompt with Distribution-Based Feature Replay for Few-Shot Class-Incremental Learning [56.29097276129473]
We propose a simple yet effective framework, named Learning Prompt with Distribution-based Feature Replay (LP-DiF). To prevent the learnable prompt from forgetting old knowledge in the new session, we propose a pseudo-feature replay approach. When progressing to a new session, pseudo-features are sampled from old-class distributions and combined with training images of the current session to optimize the prompt.
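The pseudo-feature replay step described above can be sketched in a few lines. This is a simplified illustration, not LP-DiF's code: the function names and the diagonal-Gaussian assumption per class are mine, but the mechanism, storing only per-class feature statistics and sampling from them in later sessions instead of keeping old images, follows the abstract.

```python
import numpy as np

def fit_class_stats(feats, labels):
    """Store a diagonal Gaussian (mean, variance) per class; only these
    statistics -- not images -- persist across sessions."""
    return {int(c): (feats[labels == c].mean(axis=0),
                     feats[labels == c].var(axis=0) + 1e-6)
            for c in np.unique(labels)}

def sample_pseudo_features(stats, n_per_class, seed=0):
    """Draw pseudo-features from the stored old-class Gaussians, to be
    mixed with the current session's real features when tuning the prompt."""
    rng = np.random.default_rng(seed)
    feats, labels = [], []
    for c, (mu, var) in stats.items():
        z = mu + rng.standard_normal((n_per_class, mu.size)) * np.sqrt(var)
        feats.append(z)
        labels.append(np.full(n_per_class, c))
    return np.concatenate(feats), np.concatenate(labels)

# Old session: fit statistics, then discard the raw features.
rng = np.random.default_rng(1)
old_feats = np.concatenate([c * 5.0 + rng.standard_normal((100, 4))
                            for c in range(2)])
old_labels = np.repeat(np.arange(2), 100)
stats = fit_class_stats(old_feats, old_labels)

# New session: regenerate old-class features on demand.
pseudo_feats, pseudo_labels = sample_pseudo_features(stats, 200)
```

The memory cost is two vectors per class, independent of how many images each old class originally had.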
arXiv Detail & Related papers (2024-01-03T07:59:17Z) - Reverse Stable Diffusion: What prompt was used to generate this image? [73.10116197883303]
We study the task of predicting the prompt embedding given an image generated by a generative diffusion model.
We propose a novel learning framework comprising a joint prompt regression and multi-label vocabulary classification objective.
We conduct experiments on the DiffusionDB data set, predicting text prompts from images generated by Stable Diffusion.
arXiv Detail & Related papers (2023-08-02T23:39:29Z) - BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT, that overcomes limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z) - Structural Pruning for Diffusion Models [65.02607075556742]
We present Diff-Pruning, an efficient compression method tailored for learning lightweight diffusion models from pre-existing ones.
Our empirical assessment, undertaken across several datasets, highlights two primary benefits of our proposed method.
arXiv Detail & Related papers (2023-05-18T12:38:21Z) - Your Diffusion Model is Secretly a Zero-Shot Classifier [90.40799216880342]
We show that density estimates from large-scale text-to-image diffusion models can be leveraged to perform zero-shot classification.
Our generative approach to classification attains strong results on a variety of benchmarks.
Our results are a step toward using generative over discriminative models for downstream tasks.
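The mechanism behind diffusion-as-classifier can be illustrated with a toy stand-in: classify an input by which class-conditional denoiser removes injected noise best, a Monte Carlo proxy for comparing per-class diffusion losses (and hence density estimates). The idealized "denoisers" below, which simply pull inputs toward a class mean, are my invention; the real method queries a text-to-image model once per candidate prompt.

```python
import numpy as np

def diffusion_classify(x, denoisers, sigma=0.5, n_trials=64, seed=0):
    """Pick the class whose conditional denoiser best reconstructs x from
    noised copies -- lower average denoising error means higher estimated
    class-conditional density. (Toy sketch, not the paper's implementation.)"""
    rng = np.random.default_rng(seed)
    scores = {}
    for c, denoise in denoisers.items():
        err = 0.0
        for _ in range(n_trials):
            noised = x + sigma * rng.standard_normal(x.shape)
            err += np.mean((denoise(noised) - x) ** 2)
        scores[c] = err / n_trials
    return min(scores, key=scores.get)

# Idealized class-conditional "denoisers" that pull inputs toward class means.
means = {"cat": np.array([1.0, 0.0]), "dog": np.array([0.0, 1.0])}
denoisers = {c: (lambda z, m=m: 0.2 * z + 0.8 * m) for c, m in means.items()}
```

No discriminative training happens anywhere: classification falls out of the generative model's denoising objective alone.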
arXiv Detail & Related papers (2023-03-28T17:59:56Z) - DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery [20.787180028571694]
DiffusionSeg is a synthesis-exploitation framework containing two-stage strategies.
We synthesize abundant images, and propose a novel training-free AttentionCut to obtain masks in the first stage.
In the second exploitation stage, to bridge the structural gap, we use the inversion technique, to map the given image back to diffusion features.
arXiv Detail & Related papers (2023-03-17T07:47:55Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Few-shot learning via tensor hallucination [17.381648488344222]
Few-shot classification addresses the challenge of classifying examples given only limited labeled data.
We show that using a simple loss function is more than enough for training a feature generator in the few-shot setting.
Our method sets a new state of the art, outperforming more sophisticated few-shot data augmentation methods.
arXiv Detail & Related papers (2021-04-19T17:30:33Z) - Self-Regression Learning for Blind Hyperspectral Image Fusion Without Label [11.291055330647977]
We propose a self-regression learning method that reconstructs the hyperspectral image (HSI) and estimates the observation model.
In particular, we adopt an invertible neural network (INN) for restoring the HSI, and two fully-connected networks (FCN) for estimating the observation model.
Our model outperforms state-of-the-art methods in experiments on both synthetic and real-world datasets.
arXiv Detail & Related papers (2021-03-31T04:48:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.