Addressing Negative Transfer in Diffusion Models
- URL: http://arxiv.org/abs/2306.00354v3
- Date: Sat, 30 Dec 2023 13:50:02 GMT
- Title: Addressing Negative Transfer in Diffusion Models
- Authors: Hyojun Go, JinYoung Kim, Yunsung Lee, Seunghyun Lee, Shinhyeok Oh,
Hyeongdon Moon, Seungtaek Choi
- Abstract summary: Multi-task learning (MTL) can lead to negative transfer in diffusion models.
We propose clustering the denoising tasks into small task clusters and applying MTL methods to them.
We show that interval clustering can be solved using dynamic programming, utilizing signal-to-noise ratio, timestep, and task affinity for clustering objectives.
- Score: 25.457422246404853
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion-based generative models have achieved remarkable success in various
domains. They train a shared model on denoising tasks that encompass different
noise levels simultaneously, representing a form of multi-task learning (MTL).
However, analyzing and improving diffusion models from an MTL perspective
remains under-explored. In particular, MTL can sometimes lead to the well-known
phenomenon of negative transfer, which results in the performance degradation
of certain tasks due to conflicts between tasks. In this paper, we first aim to
analyze diffusion training from an MTL standpoint, presenting two key
observations: (O1) the task affinity between denoising tasks diminishes as the
gap between noise levels widens, and (O2) negative transfer can arise even in
diffusion training. Building upon these observations, we aim to enhance
diffusion training by mitigating negative transfer. To achieve this, we propose
leveraging existing MTL methods, but the sheer number of denoising tasks makes
computing the necessary per-task loss or gradient prohibitively expensive. To
address this challenge, we propose clustering the
denoising tasks into small task clusters and applying MTL methods to them.
Specifically, based on (O2), we employ interval clustering to enforce temporal
proximity among denoising tasks within clusters. We show that interval
clustering can be solved using dynamic programming, utilizing signal-to-noise
ratio, timestep, and task affinity for clustering objectives. Through this, our
approach addresses the issue of negative transfer in diffusion models by
allowing for efficient computation of MTL methods. We validate the efficacy of
proposed clustering and its integration with MTL methods through various
experiments, demonstrating 1) improved generation quality and 2) faster
training convergence of diffusion models.
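To make the clustering step concrete, the sketch below is a minimal, hypothetical illustration (not the authors' released code): it partitions the diffusion timesteps into k contiguous intervals by dynamic programming, using the within-interval variance of the log signal-to-noise ratio as the clustering objective. The paper also mentions timestep- and task-affinity-based objectives, which would only change the `cost` function here.

```python
import numpy as np

def interval_cluster(values, k):
    """Partition a 1-D sequence (e.g. per-timestep log-SNR) into k contiguous
    intervals minimizing the total within-interval sum of squared deviations,
    via dynamic programming."""
    values = np.asarray(values, dtype=np.float64)
    n = len(values)
    prefix = np.concatenate([[0.0], np.cumsum(values)])
    prefix_sq = np.concatenate([[0.0], np.cumsum(values ** 2)])

    def cost(lo, hi):  # inclusive indices lo..hi
        s = prefix[hi + 1] - prefix[lo]
        sq = prefix_sq[hi + 1] - prefix_sq[lo]
        return sq - s * s / (hi - lo + 1)  # sum of squared deviations from the mean

    dp = np.full((n, k + 1), np.inf)
    split = np.zeros((n, k + 1), dtype=int)
    for i in range(n):
        dp[i, 1] = cost(0, i)
    for j in range(2, k + 1):
        for i in range(j - 1, n):
            for m in range(j - 2, i):
                c = dp[m, j - 1] + cost(m + 1, i)
                if c < dp[i, j]:
                    dp[i, j], split[i, j] = c, m + 1
    # Backtrack to recover the interval boundaries.
    bounds, i = [], n - 1
    for j in range(k, 0, -1):
        lo = split[i, j] if j > 1 else 0
        bounds.append((lo, i))
        i = lo - 1
    return bounds[::-1]

# Example: group T=1000 DDPM timesteps into 5 clusters by log-SNR
# (a linear beta schedule is assumed purely for illustration).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)
log_snr = np.log(alphas_bar / (1.0 - alphas_bar))
print(interval_cluster(log_snr, k=5))
```

With the returned index intervals, one loss or gradient can be maintained per cluster rather than per timestep and fed to an off-the-shelf MTL weighting method, which is the efficiency argument the abstract makes.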
Related papers
- SGW-based Multi-Task Learning in Vision Tasks [8.459976488960269]
As the scale of datasets expands and the complexity of tasks increases, knowledge sharing becomes increasingly challenging.
We propose an information bottleneck knowledge extraction module (KEM).
This module aims to reduce inter-task interference by constraining the flow of information, thereby reducing computational complexity.
arXiv Detail & Related papers (2024-10-03T13:56:50Z)
- Interpretable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance.
We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features.
In both phases, a key aspect is preserving the interpretability of the reduced targets and features by aggregating them with the mean, which is motivated by applications to Earth science.
arXiv Detail & Related papers (2024-06-12T08:30:16Z)
- DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data [16.501973201535442]
We reformulate the partially-labeled multi-task dense prediction as a pixel-level denoising problem.
We propose a novel multi-task denoising framework coined as DiffusionMTL.
The framework adopts a joint diffusion-and-denoising paradigm to model the potentially noisy distribution of task predictions or feature maps.
arXiv Detail & Related papers (2024-03-22T17:59:58Z)
- Denoising Task Routing for Diffusion Models [19.373733104929325]
Diffusion models generate highly realistic images by learning a multi-step denoising process.
Despite the inherent connection between diffusion models and multi-task learning (MTL), neural architecture designs that exploit this connection remain unexplored.
We present Denoising Task Routing (DTR), a simple add-on strategy for existing diffusion model architectures to establish distinct information pathways.
arXiv Detail & Related papers (2023-10-11T02:23:18Z)
- Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets).
arXiv Detail & Related papers (2023-10-04T09:39:05Z)
- AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging).
It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z)
- DiffusionTrack: Diffusion Model For Multi-Object Tracking [15.025051933538043]
Multi-object tracking (MOT) is a challenging vision task that aims to detect individual objects within a single frame and associate them across multiple frames.
Recent MOT approaches can be categorized into two-stage tracking-by-detection (TBD) methods and one-stage joint detection and tracking (JDT) methods.
We propose a simple but robust framework that formulates object detection and association jointly as a consistent denoising diffusion process.
arXiv Detail & Related papers (2023-08-19T04:48:41Z)
- Semi-Implicit Denoising Diffusion Models (SIDDMs) [50.30163684539586]
Existing models such as Denoising Diffusion Probabilistic Models (DDPM) deliver high-quality, diverse samples but are slowed by an inherently high number of iterative steps.
We introduce a novel approach that tackles the problem by matching implicit and explicit factors.
We demonstrate that our proposed method obtains comparable generative performance to diffusion-based models and vastly superior results to models with a small number of sampling steps.
arXiv Detail & Related papers (2023-06-21T18:49:22Z)
- Mitigating Negative Transfer in Multi-Task Learning with Exponential Moving Average Loss Weighting Strategies [0.981328290471248]
Multi-Task Learning (MTL) is a growing subject of interest in deep learning.
MTL can be impractical as certain tasks can dominate training and hurt performance in others.
We propose techniques for loss balancing based on scaling by the exponential moving average.
arXiv Detail & Related papers (2022-11-22T09:22:48Z)
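As a rough illustration of the loss-balancing idea summarized in the entry above (a generic sketch under assumptions of mine, not the cited paper's exact algorithm): each task's loss is normalized by an exponential moving average of its own magnitude, so that no single task or timestep cluster dominates the combined objective.

```python
import torch

class EMALossWeighter:
    """Scale each task loss by the inverse of its running (EMA) magnitude."""

    def __init__(self, num_tasks, beta=0.99, eps=1e-8):
        self.beta, self.eps = beta, eps
        self.ema = torch.ones(num_tasks)

    def __call__(self, losses):
        detached = torch.stack([l.detach() for l in losses])
        self.ema = self.ema.to(detached.device)
        self.ema = self.beta * self.ema + (1.0 - self.beta) * detached
        weights = 1.0 / (self.ema + self.eps)
        weights = weights * len(losses) / weights.sum()  # weights sum to num_tasks
        return sum(w * l for w, l in zip(weights, losses))

# Hypothetical usage: losses[i] is the average denoising loss of timestep
# cluster i on the current mini-batch.
# weighter = EMALossWeighter(num_tasks=5)
# total_loss = weighter(losses); total_loss.backward()
```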
- Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth [83.94528876742096]
We tackle the MTL problem of two dense tasks, i.e., semantic segmentation and depth estimation, and present a novel attention module called the Cross-Channel Attention Module (CCAM).
In a true symbiotic spirit, we then formulate a novel data augmentation for the semantic segmentation task using predicted depth called AffineMix, and a simple depth augmentation using predicted semantics called ColorAug.
Finally, we validate the performance gain of the proposed method on the Cityscapes dataset, which helps us achieve state-of-the-art results for a semi-supervised joint model based on depth and semantic segmentation.
arXiv Detail & Related papers (2022-06-21T17:40:55Z)
- Multi-Task Learning as a Bargaining Game [63.49888996291245]
In Multi-task learning (MTL), a joint model is trained to simultaneously make predictions for several tasks.
Since the gradients of these different tasks may conflict, training a joint model for MTL often yields lower performance than its corresponding single-task counterparts.
We propose viewing the gradients combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update.
arXiv Detail & Related papers (2022-02-02T13:21:53Z)
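Several of the entries above, like observation (O1) in the main abstract, revolve around gradient conflict between tasks. A minimal, hypothetical way to quantify it (not any specific paper's implementation) is the pairwise cosine similarity of per-task gradients with respect to the shared parameters; off-diagonal values near -1 flag task pairs prone to negative transfer.

```python
import torch

def pairwise_task_affinity(losses, params):
    """Return a (num_tasks, num_tasks) matrix of cosine similarities
    between per-task gradients of the shared parameters."""
    grads = []
    for loss in losses:
        g = torch.autograd.grad(loss, params, retain_graph=True)
        grads.append(torch.cat([gi.reshape(-1) for gi in g]))
    G = torch.nn.functional.normalize(torch.stack(grads), dim=1)
    return G @ G.T

# Hypothetical usage with a shared denoiser `model` and per-cluster losses:
# affinity = pairwise_task_affinity(cluster_losses, list(model.parameters()))
# print(affinity)  # negative off-diagonal entries indicate conflicting clusters
```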
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.