Fugu-MT Paper Translation (Abstract): Paris: A Decentralized Trained Open-Weight Diffusion Model

Paper overview: Paris: A Decentralized Trained Open-Weight Diffusion Model

arxiv url: http://arxiv.org/abs/2510.03434v1
Date: Fri, 03 Oct 2025 18:53:12 GMT
Status: Translation complete
Last updated in system: 2025-10-07 16:52:59.041344
Title: Paris: A Decentralized Trained Open-Weight Diffusion Model
Title (reference translation): Paris: A Decentrally Trained Open-Weight Diffusion Model
Authors: Zhiying Jiang, Raihan Seraj, Marcos Villagra, Bidhan Roy,
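Abstract summary: We present Paris, the first publicly released diffusion model pre-trained entirely through decentralized computation. Paris demonstrates that high-quality text-to-image generation can be achieved without centrally coordinated infrastructure.
Reference score (independently computed attention score): 11.120199309935435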
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present Paris, the first publicly released diffusion model pre-trained entirely through decentralized computation. Paris demonstrates that high-quality text-to-image generation can be achieved without centrally coordinated infrastructure. Paris is open for research and commercial use. Paris required implementing our Distributed Diffusion Training framework from scratch. The model consists of 8 expert diffusion models (129M-605M parameters each) trained in complete isolation with no gradient, parameter, or intermediate activation synchronization. Rather than requiring synchronized gradient updates across thousands of GPUs, we partition data into semantically coherent clusters where each expert independently optimizes its subset while collectively approximating the full distribution. A lightweight transformer router dynamically selects appropriate experts at inference, achieving generation quality comparable to centrally coordinated baselines. Eliminating synchronization enables training on heterogeneous hardware without specialized interconnects. Empirical validation confirms that Paris's decentralized training maintains generation quality while removing the dedicated GPU cluster requirement for large-scale diffusion models. Paris achieves this using 14$\times$ less training data and 16$\times$ less compute than the prior decentralized baseline.
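Abstract (reference translation): We present Paris, the first publicly released diffusion model pre-trained entirely through decentralized computation. Paris demonstrates that high-quality text-to-image generation is achievable without centrally coordinated infrastructure. Paris is available for research and commercial use. Building Paris required implementing a Distributed Diffusion Training framework from scratch. The model consists of 8 expert diffusion models (129M-605M parameters each), trained in complete isolation with no synchronization of gradients, parameters, or intermediate activations. Instead of requiring synchronized gradient updates across thousands of GPUs, the data is partitioned into semantically coherent clusters, so that each expert independently optimizes its own subset while the experts collectively approximate the full distribution. A lightweight transformer router dynamically selects suitable experts at inference, achieving generation quality comparable to centrally coordinated baselines. Removing synchronization enables training on heterogeneous hardware without specialized interconnects. Empirical validation confirms that Paris's decentralized training maintains generation quality while removing the dedicated GPU cluster requirement for large-scale diffusion models. Paris achieves this with 14$\times$ less training data and 16$\times$ less compute than the prior decentralized baseline.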
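
The partition, isolated training, and routing steps described in the abstract can be illustrated with a toy sketch. This is not the authors' framework. It makes several assumptions that do not come from the abstract: the cluster centroids are fixed in advance, each "expert" is a stand-in class (`ToyExpert`, hypothetical) that only records the mean of its shard rather than training a diffusion model, and routing uses a nearest-centroid rule in place of the lightweight transformer router. All names are illustrative.

```python
import random

def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )

def partition(data, centroids):
    """Split data into one shard per centroid; no sample appears in two shards."""
    shards = [[] for _ in centroids]
    for x in data:
        shards[nearest(x, centroids)].append(x)
    return shards

class ToyExpert:
    """Hypothetical stand-in for one expert model: it only stores its shard's mean."""
    def fit(self, shard):
        n = len(shard)
        self.mean = [sum(col) / n for col in zip(*shard)]
        return self

def route(query, centroids, experts):
    """Top-1 routing (assumed rule): pick the expert whose centroid is nearest."""
    return experts[nearest(query, centroids)]

rng = random.Random(0)
centroids = [(0.0, 0.0), (10.0, 10.0)]  # assumed, fixed cluster centres
data = [
    (cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
    for cx, cy in centroids
    for _ in range(50)
]

shards = partition(data, centroids)
# Each expert sees only its own shard; nothing is exchanged between experts.
experts = [ToyExpert().fit(shard) for shard in shards]
chosen = route((9.0, 11.0), centroids, experts)
```

Because each `fit` call reads only its own shard, the experts could be trained on separate machines with no communication, which is the property the abstract emphasizes. Only the routing step at inference needs to know about all experts.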

Related papers