Prompt-Based Exemplar Super-Compression and Regeneration for Class-Incremental Learning
- URL: http://arxiv.org/abs/2311.18266v1
- Date: Thu, 30 Nov 2023 05:59:31 GMT
- Title: Prompt-Based Exemplar Super-Compression and Regeneration for Class-Incremental Learning
- Authors: Ruxiao Duan, Yaoyao Liu, Jieneng Chen, Adam Kortylewski, Alan Yuille
- Abstract summary: The exemplar super-compression and regeneration method ESCORT substantially increases the quantity and enhances the diversity of exemplars.
To minimize the domain gap between generated exemplars and real images, we propose partial compression and diffusion-based data augmentation.
- Score: 22.676222987218555
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Replay-based methods in class-incremental learning (CIL) have attained
remarkable success, as replaying the exemplars of old classes can significantly
mitigate catastrophic forgetting. Despite their effectiveness, the inherent
memory restrictions of CIL result in saving a limited number of exemplars with
poor diversity, leading to data imbalance and overfitting issues. In this
paper, we introduce a novel exemplar super-compression and regeneration method,
ESCORT, which substantially increases the quantity and enhances the diversity
of exemplars. Rather than storing past images, we compress images into visual
and textual prompts, e.g., edge maps and class tags, and save the prompts
instead, reducing the memory usage of each exemplar to 1/24 of the original
size. In subsequent learning phases, diverse high-resolution exemplars are
generated from the prompts by a pre-trained diffusion model, e.g., ControlNet.
To minimize the domain gap between generated exemplars and real images, we
propose partial compression and diffusion-based data augmentation, allowing us
to utilize an off-the-shelf diffusion model without fine-tuning it on the
target dataset. Therefore, the same diffusion model can be downloaded whenever
it is needed, incurring no memory consumption. Comprehensive experiments
demonstrate that our method significantly improves model performance across
multiple CIL benchmarks, e.g., 5.0 percentage points higher than the previous
state-of-the-art on the 10-phase Caltech-256 benchmark.
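To make the pipeline concrete, here is a minimal sketch of the compress-then-regenerate idea with off-the-shelf tools: an image is reduced to a Canny edge map plus a class tag, and a later phase resynthesizes diverse exemplars with a pre-trained ControlNet. The checkpoints, thresholds, and prompt template below are illustrative assumptions rather than ESCORT's exact configuration; the paper's 1/24 memory figure is consistent with storing a 1-bit edge map in place of a 24-bit RGB image.

```python
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# --- Compression: image -> visual prompt (edge map) + textual prompt (tag) ---
bgr = cv2.imread("old_class_exemplar.jpg")
gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)       # binary edge map; packing it to 1 bit/pixel
np.save("exemplar_edges.npy", edges)    # (np.packbits) gives ~1/24 of 24-bit RGB
class_tag = "golden retriever"          # class tag saved alongside the edge map

# --- Regeneration in a later phase: prompts -> diverse exemplars ---
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")

edges = np.load("exemplar_edges.npy")
cond = Image.fromarray(np.stack([edges] * 3, axis=-1))  # 3-channel condition image
for seed in range(4):                                   # varied seeds -> diverse exemplars
    out = pipe(f"a photo of a {class_tag}", image=cond,
               num_inference_steps=30,
               generator=torch.Generator("cuda").manual_seed(seed)).images[0]
    out.save(f"regenerated_{seed}.png")
```

Because the diffusion model and ControlNet are used as-is, without fine-tuning, they can be re-downloaded whenever needed and never count against the exemplar memory budget.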
Related papers
- CALLIC: Content Adaptive Learning for Lossless Image Compression [64.47244912937204] (arXiv, 2024-12-23T10:41:18Z)
CALLIC sets a new state-of-the-art (SOTA) for learned lossless image compression.
We propose a content-aware autoregressive self-attention mechanism by leveraging convolutional gating operations.
During encoding, we decompose pre-trained layers, including depth-wise convolutions, using low-rank matrices, and then adapt the incremental weights to the test image via Rate-guided Progressive Fine-Tuning (RPFT).
RPFT fine-tunes on a gradually increasing number of patches, sorted in descending order of estimated entropy, which streamlines the learning process and reduces adaptation time.
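A generic PyTorch sketch of this kind of low-rank adaptation of frozen pretrained weights follows; it is a plain LoRA-style linear adapter, and CALLIC's exact decomposition of depth-wise convolutions and its rate-guided schedule are not reproduced here.

```python
import torch.nn as nn

class LowRankAdapter(nn.Module):
    """Wrap a frozen pretrained linear layer with trainable low-rank
    incremental weights (illustrative LoRA-style sketch)."""
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pretrained weights stay fixed
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)       # start as an identity update

    def forward(self, x):
        return self.base(x) + self.up(self.down(x))  # base + low-rank delta
```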
- Hollowed Net for On-Device Personalization of Text-to-Image Diffusion Models [51.3915762595891] (arXiv, 2024-11-02T08:42:48Z)
This paper presents an efficient LoRA-based personalization approach for on-device subject-driven generation.
Our method, termed Hollowed Net, enhances memory efficiency during fine-tuning by modifying the architecture of a diffusion U-Net.
- Effective Diffusion Transformer Architecture for Image Super-Resolution [63.254644431016345] (arXiv, 2024-09-29T07:14:16Z)
We design an effective diffusion transformer for image super-resolution (DiT-SR).
In practice, DiT-SR adopts an overall U-shaped architecture with a uniform isotropic design for all transformer blocks.
We analyze the limitations of the widely used AdaLN and present a frequency-adaptive time-step conditioning module.
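For context, a standard DiT-style AdaLN block modulates normalized features with a scale and shift predicted from the time-step embedding. A generic PyTorch sketch of that baseline follows; the names are illustrative, and DiT-SR's frequency-adaptive variant is not reproduced here.

```python
import torch.nn as nn

class AdaLN(nn.Module):
    """Adaptive LayerNorm: per-channel scale/shift predicted from a
    conditioning vector such as a timestep embedding (baseline sketch)."""
    def __init__(self, dim: int, cond_dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        self.to_scale_shift = nn.Linear(cond_dim, 2 * dim)

    def forward(self, x, t_emb):
        # x: (batch, tokens, dim); t_emb: (batch, cond_dim)
        scale, shift = self.to_scale_shift(t_emb).chunk(2, dim=-1)
        return self.norm(x) * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)
```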
- Probing Image Compression For Class-Incremental Learning [8.711266563753846] (arXiv, 2024-03-10T18:58:14Z)
Continual machine learning (ML) systems store representative samples, known as exemplars, under a limited memory budget to maintain performance on previously learned data.
In this paper, we explore image compression as a strategy to enlarge the buffer's capacity and thereby increase exemplar diversity.
We introduce a framework that incorporates image compression into continual ML, including a pre-processing compression step and an efficient method for selecting the compression rate and algorithm.
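The core idea can be sketched in a few lines: compress each exemplar with a standard codec so that more samples fit in the same byte budget. A minimal illustration with JPEG follows; the budget, quality value, and helper name are assumptions, not the paper's selection method.

```python
from io import BytesIO
from PIL import Image

def fill_buffer(image_paths, budget_bytes, quality=50):
    """Greedily JPEG-compress exemplars so more of them fit in a
    fixed memory budget (illustrative; lower quality -> more exemplars)."""
    buffer, used = [], 0
    for path in image_paths:
        blob = BytesIO()
        Image.open(path).convert("RGB").save(blob, format="JPEG", quality=quality)
        size = blob.tell()
        if used + size > budget_bytes:
            break
        buffer.append(blob.getvalue())   # store compressed bytes, not pixels
        used += size
    return buffer
```

Choosing the quality (i.e., the rate) per dataset is exactly the rate/algorithm selection problem the framework addresses.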
- Mitigate Replication and Copying in Diffusion Models with Generalized Caption and Dual Fusion Enhancement [7.9911486976035215] (arXiv, 2023-09-13T18:43:13Z)
We introduce a generality score that measures caption generality and employ a large language model (LLM) to generalize training captions.
We leverage generalized captions and propose a novel dual fusion enhancement approach to mitigate the replication of diffusion models.
- MOFA: A Model Simplification Roadmap for Image Restoration on Mobile Devices [17.54747506334433] (arXiv, 2023-08-24T01:29:15Z)
We propose a roadmap that can be applied to further accelerate image restoration models prior to deployment.
Our approach decreases runtime by up to 13% and reduces the number of parameters by up to 23%, while increasing PSNR and SSIM.
- LLDiffusion: Learning Degradation Representations in Diffusion Models for Low-Light Image Enhancement [118.83316133601319] (arXiv, 2023-07-27T07:22:51Z)
Current deep learning methods for low-light image enhancement (LLIE) typically rely on pixel-wise mapping learned from paired data.
We propose a degradation-aware learning scheme for LLIE using diffusion models, which effectively integrates degradation and image priors into the diffusion process.
- Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103] (arXiv, 2023-06-21T06:59:07Z)
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels, but they are not widely adopted by general users due to their substantial storage requirements.
We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
- Multimodal Data Augmentation for Image Captioning using Diffusion Models [12.221685807426264] (arXiv, 2023-05-03T01:57:33Z)
We propose a data augmentation method, leveraging a text-to-image model called Stable Diffusion, to expand the training set.
Experiments on the MS COCO dataset demonstrate the advantages of our approach over several benchmark methods.
Further improvements in training efficiency and effectiveness can be obtained by filtering the generated data.
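A minimal sketch of this kind of caption-driven augmentation with the diffusers library follows; the checkpoint and caption are illustrative, and the paper's filtering step is omitted.

```python
import torch
from diffusers import StableDiffusionPipeline

# Generate synthetic (image, caption) pairs to expand a captioning train set.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

captions = ["a dog catching a frisbee in a park"]  # e.g., MS COCO-style captions
for i, caption in enumerate(captions):
    image = pipe(caption, num_inference_steps=30).images[0]
    image.save(f"augmented_{i}.png")  # pair this image with its caption
```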
- Effective Data Augmentation With Diffusion Models [65.09758931804478] (arXiv, 2023-02-07T20:42:28Z)
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models.
Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples.
We evaluate our approach on few-shot image classification tasks and on a real-world weed recognition task, and observe accuracy improvements across the tested domains.
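Such semantics-changing edits can be sketched with an off-the-shelf image-to-image diffusion pipeline; the file names, prompt, and strength value below are illustrative assumptions, with the weed-recognition setting used only as an example.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Edit a labelled source image toward a new rendition of the same concept.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

source = Image.open("weed_sample.jpg").convert("RGB").resize((512, 512))
edited = pipe(
    prompt="a photo of a weed in a sunlit field",  # target semantics
    image=source,
    strength=0.5,  # how far the edit departs from the source image
).images[0]
edited.save("weed_sample_aug.png")  # keeps the original label
```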
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.