Assessing Open-world Forgetting in Generative Image Model Customization
- URL: http://arxiv.org/abs/2410.14159v2
- Date: Wed, 05 Feb 2025 13:06:11 GMT
- Title: Assessing Open-world Forgetting in Generative Image Model Customization
- Authors: Héctor Laria, Alex Gomez-Villa, Kai Wang, Bogdan Raducanu, Joost van de Weijer
- Abstract summary: We introduce the concept of open-world forgetting to characterize the vast scope of unintended alterations.
We show that even minor model adaptations can lead to significant semantic drift affecting areas far beyond newly introduced concepts.
We propose a functional regularization strategy that effectively preserves original capabilities while accommodating new concepts.
- Score: 18.246389150176665
- Abstract: Recent advances in diffusion models have significantly enhanced image generation capabilities. However, customizing these models with new classes often leads to unintended consequences that compromise their reliability. We introduce the concept of open-world forgetting to characterize the vast scope of these unintended alterations. Our work presents the first systematic investigation into open-world forgetting in diffusion models, focusing on semantic and appearance drift of representations. Using zero-shot classification, we demonstrate that even minor model adaptations can lead to significant semantic drift affecting areas far beyond newly introduced concepts, with accuracy drops of up to 60% on previously learned concepts. Our analysis of appearance drift reveals substantial changes in texture and color distributions of generated content. To address these issues, we propose a functional regularization strategy that effectively preserves original capabilities while accommodating new concepts. Through extensive experiments across multiple datasets and evaluation metrics, we demonstrate that our approach significantly reduces both semantic and appearance drift. Our study highlights the importance of considering open-world forgetting in future research on model customization and finetuning methods.
Related papers
- A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends [67.43992456058541]
Image restoration (IR) refers to the process of improving visual quality of images while removing degradation, such as noise, blur, weather effects, and so on.
Traditional IR methods typically target specific types of degradation, which limits their effectiveness in real-world scenarios with complex distortions.
The all-in-one image restoration (AiOIR) paradigm has emerged, offering a unified framework that adeptly addresses multiple degradation types.
arXiv Detail & Related papers (2024-10-19T11:11:09Z)
- What Matters When Repurposing Diffusion Models for General Dense Perception Tasks? [49.84679952948808]
Recent works show promising results by simply fine-tuning T2I diffusion models for dense perception tasks.
We conduct a thorough investigation into critical factors that affect transfer efficiency and performance when using diffusion priors.
Our work culminates in the development of GenPercept, an effective deterministic one-step fine-tuning paradigm tailored for dense visual perception tasks.
arXiv Detail & Related papers (2024-03-10T04:23:24Z)
- Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors [56.82596340418697]
We propose a simple yet effective framework comprising a pre-trained Stable Diffusion (SD) model containing rich generative priors, a unified head (U-head) capable of integrating hierarchical representations, and an adapted expert providing discriminative priors.
Comprehensive investigations unveil potential characteristics of Vermouth, such as varying granularity of perception concealed in latent variables at distinct time steps and various U-net stages.
The promising results demonstrate the potential of diffusion models as formidable learners, establishing their significance in furnishing informative and robust visual representations.
arXiv Detail & Related papers (2024-01-29T10:36:57Z)
- Demystifying Variational Diffusion Models [23.601173340762074]
We present a more straightforward introduction to diffusion models using directed graphical modelling and variational Bayesian principles.
Our exposition constitutes a comprehensive technical review spanning from foundational concepts like deep latent variable models to recent advances in continuous-time diffusion-based modelling.
We provide additional mathematical insights that were omitted in the seminal works whenever possible to aid in understanding, while avoiding the introduction of new notation.
arXiv Detail & Related papers (2024-01-11T22:37:37Z)
- Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation on NYU Depth V2 and KITTI, and in semantic segmentation on CityScapes.
arXiv Detail & Related papers (2023-12-22T14:40:55Z)
- Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey [96.99328714941657]
We present a comprehensive review of recent diffusion model-based methods on image restoration.
We classify and emphasize the innovative designs using diffusion models for both IR and blind/real-world IR.
We propose five potential and challenging directions for the future research of diffusion model-based IR.
arXiv Detail & Related papers (2023-08-18T08:40:38Z)
- Mitigating Bias: Enhancing Image Classification by Improving Model Explanations [9.791305104409057]
Deep learning models tend to rely heavily on simple and easily discernible features in the background of images.
We introduce a mechanism that encourages the model to allocate sufficient attention to the foreground.
Our findings highlight the importance of foreground attention in enhancing model understanding and representation of the main concepts within images.
arXiv Detail & Related papers (2023-07-04T04:46:44Z)
- Effective Data Augmentation With Diffusion Models [65.09758931804478]
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models.
Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples.
We evaluate our approach on few-shot image classification tasks, and on a real-world weed recognition task, and observe an improvement in accuracy in tested domains.
arXiv Detail & Related papers (2023-02-07T20:42:28Z)
- Embracing New Techniques in Deep Learning for Estimating Image Memorability [0.0]
We propose and evaluate five alternative deep learning models to predict image memorability.
Our findings suggest that the key prior memorability network had overstated its generalizability and was overfit on its training set.
We make our new state-of-the-art model readily available to the research community, allowing memory researchers to make predictions about memorability on a wider range of images.
arXiv Detail & Related papers (2021-05-21T23:05:23Z)