Related papers: MuseumMaker: Continual Style Customization without Catastrophic Forgetting

MuseumMaker: Continual Style Customization without Catastrophic Forgetting

URL: http://arxiv.org/abs/2404.16612v2
Date: Mon, 29 Apr 2024 12:08:10 GMT
Title: MuseumMaker: Continual Style Customization without Catastrophic Forgetting
Authors: Chenxi Liu, Gan Sun, Wenqi Liang, Jiahua Dong, Can Qin, Yang Cong,
Abstract summary: We propose MuseumMaker, a method that enables the synthesis of images by following a set of customized styles in a never-end manner. When facing with a new customization style, we develop a style distillation loss module to extract and learn the styles of the training data for new image generation. It can minimize the learning biases caused by content of new training images, and address the catastrophic overfitting issue induced by few-shot images.
Score: 50.12727620780213
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Pre-trained large text-to-image (T2I) models with an appropriate text prompt has attracted growing interests in customized images generation field. However, catastrophic forgetting issue make it hard to continually synthesize new user-provided styles while retaining the satisfying results amongst learned styles. In this paper, we propose MuseumMaker, a method that enables the synthesis of images by following a set of customized styles in a never-end manner, and gradually accumulate these creative artistic works as a Museum. When facing with a new customization style, we develop a style distillation loss module to extract and learn the styles of the training data for new image generation. It can minimize the learning biases caused by content of new training images, and address the catastrophic overfitting issue induced by few-shot images. To deal with catastrophic forgetting amongst past learned styles, we devise a dual regularization for shared-LoRA module to optimize the direction of model update, which could regularize the diffusion model from both weight and feature aspects, respectively. Meanwhile, to further preserve historical knowledge from past styles and address the limited representability of LoRA, we consider a task-wise token learning module where a unique token embedding is learned to denote a new style. As any new user-provided style come, our MuseumMaker can capture the nuances of the new styles while maintaining the details of learned styles. Experimental results on diverse style datasets validate the effectiveness of our proposed MuseumMaker method, showcasing its robustness and versatility across various scenarios.

Related papers

Pluggable Style Representation Learning for Multi-Style Transfer [41.09041735653436]
We develop a style transfer framework by decoupling the style modeling and transferring. For style modeling, we propose a style representation learning scheme to encode the style information into a compact representation. For style transferring, we develop a style-aware multi-style transfer network (SaMST) to adapt to diverse styles using pluggable style representations.
arXiv Detail & Related papers (2025-03-26T09:44:40Z)
IntroStyle: Training-Free Introspective Style Attribution using Diffusion Features [89.95303251220734]
We present a training-free framework to solve the style attribution problem, using the features produced by a diffusion model alone. This is denoted as introspective style attribution (IntroStyle) and demonstrates superior performance to state-of-the-art models for style retrieval. We also introduce a synthetic dataset of Style Hacks (SHacks) to isolate artistic style and evaluate fine-grained style attribution performance.
arXiv Detail & Related papers (2024-12-19T01:21:23Z)
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation [70.95783968368124]
We introduce a novel multi-modal autoregressive model, dubbed $textbfInstaManip$. We propose an innovative group self-attention mechanism to break down the in-context learning process into two separate stages. Our method surpasses previous few-shot image manipulation models by a notable margin.
arXiv Detail & Related papers (2024-12-02T01:19:21Z)
Customizing Text-to-Image Models with a Single Image Pair [47.49970731632113]
Art reinterpretation is the practice of creating a variation of a reference work, making a paired artwork that exhibits a distinct artistic style. We propose Pair Customization, a new customization method that learns stylistic difference from a single image pair and then applies the acquired style to the generation process.
arXiv Detail & Related papers (2024-05-02T17:59:52Z)
Measuring Style Similarity in Diffusion Models [118.22433042873136]
We present a framework for understanding and extracting style descriptors from images. Our framework comprises a new dataset curated using the insight that style is a subjective property of an image. We also propose a method to extract style attribute descriptors that can be used to style of a generated image to the images used in the training dataset of a text-to-image model.
arXiv Detail & Related papers (2024-04-01T17:58:30Z)
Implicit Style-Content Separation using B-LoRA [61.664293840163865]
We introduce B-LoRA, a method that implicitly separate the style and content components of a single image. By analyzing the architecture of SDXL combined with LoRA, we find that jointly learning the LoRA weights of two specific blocks achieves style-content separation.
arXiv Detail & Related papers (2024-03-21T17:20:21Z)
ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank [9.99530386586636]
Artistic style transfer aims to repaint the content image with the learned artistic style. Existing artistic style transfer methods can be divided into two categories: small model-based approaches and pre-trained large-scale model-based approaches. We propose ArtBank, a novel artistic style transfer framework, to generate highly realistic stylized images.
arXiv Detail & Related papers (2023-12-11T05:53:40Z)
DIFF-NST: Diffusion Interleaving For deFormable Neural Style Transfer [27.39248034592382]
We propose using a new class of models to perform style transfer while enabling deformable style transfer. We show how leveraging the priors of these models can expose new artistic controls at inference time.
arXiv Detail & Related papers (2023-07-09T12:13:43Z)
StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot Learning [89.86971464234533]
Cross-Domain Few-Shot Learning (CD-FSL) is a recently emerging task that tackles few-shot learning across different domains. We propose a novel model-agnostic meta Style Adversarial training (StyleAdv) method together with a novel style adversarial attack method. Our method is gradually robust to the visual styles, thus boosting the generalization ability for novel target datasets.
arXiv Detail & Related papers (2023-02-18T11:54:37Z)
Style-Agnostic Reinforcement Learning [9.338454092492901]
We present a novel method of learning style-agnostic representation using both style transfer and adversarial learning. Our method trains the actor with diverse image styles generated from an inherent adversarial style generator. We verify that our method achieves competitive or better performances than the state-of-the-art approaches on Procgen and Distracting Control Suite benchmarks.
arXiv Detail & Related papers (2022-08-31T13:45:00Z)
Learning Diverse Tone Styles for Image Retouching [73.60013618215328]
We propose to learn diverse image retouching with normalizing flow-based architectures. A joint-training pipeline is composed of a style encoder, a conditional RetouchNet, and the image tone style normalizing flow (TSFlow) module. Our proposed method performs favorably against state-of-the-art methods and is effective in generating diverse results.
arXiv Detail & Related papers (2022-07-12T09:49:21Z)
Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning. Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z)
StyleAugment: Learning Texture De-biased Representations by Style Augmentation without Pre-defined Textures [7.81768535871051]
Recently powerful vision classifiers are biased towards textures, while shape information is overlooked by the models. A simple attempt by augmenting training images using the artistic style transfer method, called Stylized ImageNet, can reduce the texture bias. However, Stylized ImageNet approach has two drawbacks in fidelity and diversity. We propose a StyleAugment by augmenting styles from the mini-batch.
arXiv Detail & Related papers (2021-08-24T07:17:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.