Related papers: Decentralized Diffusion Models

Decentralized Diffusion Models

URL: http://arxiv.org/abs/2501.05450v2
Date: Fri, 10 Jan 2025 18:58:11 GMT
Title: Decentralized Diffusion Models
Authors: David McAllister, Matthew Tancik, Jiaming Song, Angjoo Kanazawa,
Abstract summary: Large-scale AI model training divides work across thousands of GPU, then synchronizes gradients across them at each step.<n>This incurs a significant network burden that only centralized, monolithic clusters can support.<n>We propose Decentralized Diffusion Models, a scalable framework for distributing diffusion model training across independent clusters.
Score: 53.89995588977048
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large-scale AI model training divides work across thousands of GPUs, then synchronizes gradients across them at each step. This incurs a significant network burden that only centralized, monolithic clusters can support, driving up infrastructure costs and straining power systems. We propose Decentralized Diffusion Models, a scalable framework for distributing diffusion model training across independent clusters or datacenters by eliminating the dependence on a centralized, high-bandwidth networking fabric. Our method trains a set of expert diffusion models over partitions of the dataset, each in full isolation from one another. At inference time, the experts ensemble through a lightweight router. We show that the ensemble collectively optimizes the same objective as a single model trained over the whole dataset. This means we can divide the training burden among a number of "compute islands," lowering infrastructure costs and improving resilience to localized GPU failures. Decentralized diffusion models empower researchers to take advantage of smaller, more cost-effective and more readily available compute like on-demand GPU nodes rather than central integrated systems. We conduct extensive experiments on ImageNet and LAION Aesthetics, showing that decentralized diffusion models FLOP-for-FLOP outperform standard diffusion models. We finally scale our approach to 24 billion parameters, demonstrating that high-quality diffusion models can now be trained with just eight individual GPU nodes in less than a week.

Related papers

Performance Analysis of Decentralized Federated Learning Deployments [1.7249361224827533]
Decentralized Federated Learning (DFL) is introduced to address these challenges. It facilitates direct collaboration among participating devices without relying on a central server. This work explores crucial factors influencing the convergence and generalization capacity of DFL models.
arXiv Detail & Related papers (2025-03-14T19:37:13Z)
Federated Learning for Diffusion Models [12.46092849473786]
Diffusion models are powerful generative models that can produce highly realistic samples for various tasks. We propose FedDDPM-Federated Learning with Denoising Diffusion Probabilistic Models. We provide a rigorous convergence analysis of FedDDPM and propose an enhanced algorithm, FedDDPM+, to reduce training overheads.
arXiv Detail & Related papers (2025-03-09T03:41:10Z)
Flexiffusion: Segment-wise Neural Architecture Search for Flexible Denoising Schedule [50.260693393896716]
Diffusion models are cutting-edge generative models adept at producing diverse, high-quality images. Recent techniques have been employed to automatically search for faster generation processes. We introduce Flexiffusion, a novel training-free NAS paradigm designed to accelerate diffusion models.
arXiv Detail & Related papers (2024-09-26T06:28:05Z)
Constrained Diffusion Models via Dual Training [80.03953599062365]
Diffusion processes are prone to generating samples that reflect biases in a training dataset. We develop constrained diffusion models by imposing diffusion constraints based on desired distributions. We show that our constrained diffusion models generate new data from a mixture data distribution that achieves the optimal trade-off among objective and constraints.
arXiv Detail & Related papers (2024-08-27T14:25:42Z)
Guided Diffusion from Self-Supervised Diffusion Features [49.78673164423208]
Guidance serves as a key concept in diffusion models, yet its effectiveness is often limited by the need for extra data annotation or pretraining. We propose a framework to extract guidance from, and specifically for, diffusion models.
arXiv Detail & Related papers (2023-12-14T11:19:11Z)
NeFL: Nested Model Scaling for Federated Learning with System Heterogeneous Clients [44.89061671579694]
Federated learning (FL) enables distributed training while preserving data privacy, but stragglers-slow or incapable clients-can significantly slow down the total training time and degrade performance. We propose nested federated learning (NeFL), a framework that efficiently divides deep neural networks into submodels using both depthwise and widthwise scaling. NeFL achieves performance gain, especially for the worst-case submodel compared to baseline approaches.
arXiv Detail & Related papers (2023-08-15T13:29:14Z)
Fed-FSNet: Mitigating Non-I.I.D. Federated Learning via Fuzzy Synthesizing Network [19.23943687834319]
Federated learning (FL) has emerged as a promising privacy-preserving distributed machine learning framework. We propose a novel FL training framework, dubbed Fed-FSNet, using a properly designed Fuzzy Synthesizing Network (FSNet) to mitigate the Non-I.I.D. at-the-source issue.
arXiv Detail & Related papers (2022-08-21T18:40:51Z)
Supernet Training for Federated Image Classification under System Heterogeneity [15.2292571922932]
In this work, we propose a novel framework to consider both scenarios, namely Federation of Supernet Training (FedSup) It is inspired by how averaging parameters in the model aggregation stage of Federated Learning (FL) is similar to weight-sharing in supernet training. Under our framework, we present an efficient algorithm (E-FedSup) by sending the sub-model to clients in the broadcast stage for reducing communication costs and training overhead.
arXiv Detail & Related papers (2022-06-03T02:21:01Z)
Decentralized Training of Foundation Models in Heterogeneous Environments [77.47261769795992]
Training foundation models, such as GPT-3 and PaLM, can be extremely expensive. We present the first study of training large foundation models with model parallelism in a decentralized regime over a heterogeneous network.
arXiv Detail & Related papers (2022-06-02T20:19:51Z)
Parallel Successive Learning for Dynamic Distributed Model Training over Heterogeneous Wireless Networks [50.68446003616802]
Federated learning (FedL) has emerged as a popular technique for distributing model training over a set of wireless devices. We develop parallel successive learning (PSL), which expands the FedL architecture along three dimensions. Our analysis sheds light on the notion of cold vs. warmed up models, and model inertia in distributed machine learning.
arXiv Detail & Related papers (2022-02-07T05:11:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.