Efficient Diffusion-Driven Corruption Editor for Test-Time Adaptation
- URL: http://arxiv.org/abs/2403.10911v3
- Date: Thu, 11 Jul 2024 08:12:07 GMT
- Title: Efficient Diffusion-Driven Corruption Editor for Test-Time Adaptation
- Authors: Yeongtak Oh, Jonghyun Lee, Jooyoung Choi, Dahuin Jung, Uiwon Hwang, Sungroh Yoon
- Abstract summary: Test-time adaptation (TTA) addresses the unforeseen distribution shifts occurring during test time.
We propose a novel TTA method that leverages an image editing model based on a latent diffusion model (LDM) and fine-tunes it using our newly introduced corruption modeling scheme.
Our model achieves the best performance with a 100 times faster runtime than that of a diffusion-based baseline.
- Score: 37.67328706787212
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Test-time adaptation (TTA) addresses the unforeseen distribution shifts occurring during test time. In TTA, performance, memory consumption, and time consumption are crucial considerations. A recent diffusion-based TTA approach for restoring corrupted images involves image-level updates. However, using pixel space diffusion significantly increases resource requirements compared to conventional model updating TTA approaches, revealing limitations as a TTA method. To address this, we propose a novel TTA method that leverages an image editing model based on a latent diffusion model (LDM) and fine-tunes it using our newly introduced corruption modeling scheme. This scheme enhances the robustness of the diffusion model against distribution shifts by creating (clean, corrupted) image pairs and fine-tuning the model to edit corrupted images into clean ones. Moreover, we introduce a distilled variant to accelerate the model for corruption editing using only 4 network function evaluations (NFEs). We extensively validated our method across various architectures and datasets including image and video domains. Our model achieves the best performance with a 100 times faster runtime than that of a diffusion-based baseline. Furthermore, it is three times faster than the previous model updating TTA method that utilizes data augmentation, making an image-level updating approach more feasible.
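The abstract's idea of creating (clean, corrupted) image pairs can be illustrated with a minimal sketch. The abstract does not specify which corruptions the authors' scheme uses, so additive Gaussian noise stands in here as an assumption. The function name `make_pair`, the `sigma` parameter, and the flat-list image representation are hypothetical and chosen only for illustration.

```python
import random

def make_pair(clean, sigma=0.1, rng=None):
    """Return a (clean, corrupted) pair for one image.

    `clean` is a flat list of pixel values in [0, 1].
    Gaussian noise is an assumed stand-in corruption, not the paper's scheme.
    """
    rng = rng or random.Random()
    # Add zero-mean noise to every pixel, then clamp back into [0, 1].
    corrupted = [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in clean]
    return clean, corrupted

# Build a small toy dataset of random "clean" images (16 pixels each).
rng = random.Random(0)
clean_images = [[rng.random() for _ in range(16)] for _ in range(4)]

# Each pair is (target, input) material for fine-tuning an editor
# that maps corrupted images back to clean ones.
pairs = [make_pair(img, sigma=0.1, rng=rng) for img in clean_images]
```

In a fine-tuning loop, the corrupted image would serve as the model input and the clean image as the editing target.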
Related papers
- Effective Diffusion Transformer Architecture for Image Super-Resolution [63.254644431016345]
We design an effective diffusion transformer for image super-resolution (DiT-SR).
In practice, DiT-SR leverages an overall U-shaped architecture, and adopts a uniform isotropic design for all the transformer blocks.
We analyze the limitation of the widely used AdaLN, and present a frequency-adaptive time-step conditioning module.
arXiv Detail & Related papers (2024-09-29T07:14:16Z)
- Debiasing Text-to-Image Diffusion Models [84.46750441518697]
Learning-based Text-to-Image (TTI) models have revolutionized the way visual content is generated in various domains.
Recent research has shown that non-negligible social bias exists in current state-of-the-art TTI systems.
arXiv Detail & Related papers (2024-02-22T14:33:23Z)
- Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI).
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion).
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z)
- Fixed Point Diffusion Models [13.035518953879539]
Fixed Point Diffusion Model (FPDM) is a novel approach to image generation that integrates the concept of fixed point solving into the framework of diffusion-based generative modeling.
Our approach embeds an implicit fixed point solving layer into the denoising network of a diffusion model, transforming the diffusion process into a sequence of closely-related fixed point problems.
We conduct experiments with state-of-the-art models on ImageNet, FFHQ, CelebA-HQ, and LSUN-Church, demonstrating substantial improvements in performance and efficiency.
arXiv Detail & Related papers (2024-01-16T18:55:54Z)
- A-SDM: Accelerating Stable Diffusion through Redundancy Removal and Performance Optimization [54.113083217869516]
In this work, we first explore the computationally redundant parts of the network.
We then prune the redundant blocks of the model while maintaining network performance.
Thirdly, we propose a global-regional interactive (GRI) attention to speed up the computationally intensive attention part.
arXiv Detail & Related papers (2023-12-24T15:37:47Z)
- Masked Images Are Counterfactual Samples for Robust Fine-tuning [77.82348472169335]
Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness.
We propose a novel fine-tuning method, which uses masked images as counterfactual samples that help improve the robustness of the fine-tuning model.
arXiv Detail & Related papers (2023-03-06T11:51:28Z)
- Blended Latent Diffusion [18.043090347648157]
We present an accelerated solution to the task of local text-driven editing of generic images, where the desired edits are confined to a user-provided mask.
Our solution leverages a recent text-to-image Latent Diffusion Model (LDM), which speeds up diffusion by operating in a lower-dimensional latent space.
arXiv Detail & Related papers (2022-06-06T17:58:04Z)
- Covariance-aware Feature Alignment with Pre-computed Source Statistics for Test-time Adaptation to Multiple Image Corruptions [11.859913430860335]
Real-world image recognition systems often face corrupted input images, which cause distribution shifts and degrade the performance of models.
Test-time adaptation (TTA) is one of the settings that can address this problem.
We propose a novel TTA method named Covariance-Aware Feature alignment (CAFe) to address the distribution gap during testing.
arXiv Detail & Related papers (2022-04-28T02:50:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.