Fugu-MT Paper Translation (Abstract): D3S2: Diffusion-Guided Dataset Distillation for Semantic Segmentation

Paper abstract: D3S2: Diffusion-Guided Dataset Distillation for Semantic Segmentation

arxiv url: http://arxiv.org/abs/2605.25022v1
Date: Sun, 24 May 2026 12:01:38 GMT
Status: Translation complete
Last updated in system: 2026-05-26 19:50:18.657903
Title: D3S2: Diffusion-Guided Dataset Distillation for Semantic Segmentation
Title (reference translation): D3S2: Diffusion-Guided Dataset Distillation for Semantic Segmentation
Authors: Wenjie Zheng, Haoji Hu, Jiali Lu, Xingze Zou, Jing Wang,
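Abstract summary: Segmentation DD poses three key challenges: (i) long-tailed class imbalance, (ii) the need for strict pixel-wise alignment between images and dense labels, and (iii) the high computational cost of optimizing high-resolution data with complex models. In Class-Balanced Mask Selection, a representative mask set is constructed via a greedy strategy that prioritizes underrepresented classes. In Diffusion-Guided Image Synthesis, a pretrained layout-to-image diffusion model generates images conditioned on the selected masks, naturally ensuring alignment.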
Reference score (independently computed attention measure): 8.30626759264565
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Dataset distillation (DD) aims to compress large-scale datasets into compact synthetic sets while preserving training efficacy. However, existing studies mainly focus on image classification, leaving dense prediction tasks such as semantic segmentation largely underexplored. In this work, we identify three key challenges for segmentation DD: (i) long-tailed class imbalance, (ii) the need for strict pixel-wise alignment between images and dense labels, and (iii) the high computational cost of optimizing high-resolution data with complex models. To address these challenges, we propose D3S2, a Diffusion-guided Dataset Distillation framework for Semantic Segmentation. Our method adopts a two-stage design. In Class-Balanced Mask Selection, we construct a representative mask set via a greedy strategy that prioritizes underrepresented classes. In Diffusion-Guided Image Synthesis, we employ a pretrained layout-to-image diffusion model to generate images conditioned on the selected masks, naturally ensuring spatial alignment. To further enhance the training utility of synthesized data, we introduce guided diffusion sampling with two complementary objectives: a segmentation-consistency loss for pixel-level alignment, and a class-wise feature matching loss for aligning per-class feature statistics across layers. Extensive experiments demonstrate the superiority of D3S2. Notably, at an extreme compression rate of 1%, our method achieves 24.99% and 35.49% mIoU on ADE20K and COCO-Stuff with Mask2Former (Swin-S), outperforming random selection by 9.34% and 5.70%, respectively.
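Abstract (reference translation): Dataset distillation (DD) aims to compress large-scale datasets into compact synthetic sets while maintaining training effectiveness. However, existing research focuses mainly on image classification, and dense prediction tasks such as semantic segmentation remain largely unexplored. This paper describes three challenges in segmentation DD: (i) long-tailed class imbalance, (ii) the need for strict pixel-wise alignment between images and dense labels, and (iii) the high computational cost of optimizing high-resolution data with complex models. To address these challenges, we propose D3S2, a diffusion-guided dataset distillation framework for semantic segmentation. Our method adopts a two-stage design. In Class-Balanced Mask Selection, a representative mask set is built using a greedy strategy that prioritizes underrepresented classes. In Diffusion-Guided Image Synthesis, a pretrained layout-to-image diffusion model generates images conditioned on the selected masks, naturally ensuring spatial alignment. To further increase the training utility of the synthesized data, we introduce guided diffusion sampling with two complementary objectives: a segmentation-consistency loss for pixel-level alignment and a class-wise feature matching loss that aligns per-class feature statistics across layers. Extensive experiments demonstrate the superiority of D3S2. In particular, at an extreme compression rate of 1%, it achieves 24.99% and 35.49% mIoU on ADE20K and COCO-Stuff, respectively, with Mask2Former (Swin-S), outperforming random selection by 9.34% and 5.70%.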
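
The abstract says only that Class-Balanced Mask Selection uses a greedy strategy that prioritizes underrepresented classes; it does not give the scoring rule. The following is a minimal illustrative sketch of one such greedy selection, not the authors' implementation. The function name `greedy_class_balanced_selection`, the representation of each mask as a set of class ids, and the inverse-coverage gain `1 / (1 + coverage)` are all assumptions made for illustration.

```python
from collections import Counter


def greedy_class_balanced_selection(mask_classes, budget):
    """Pick up to `budget` mask indices, favoring classes seen least so far.

    mask_classes: list of sets; mask_classes[i] holds the class ids that
                  appear in mask i (a hypothetical, simplified representation).
    budget:       maximum number of masks to keep.
    """
    selected = []
    coverage = Counter()  # number of selected masks containing each class
    remaining = set(range(len(mask_classes)))

    def gain(i):
        # Assumed scoring rule: a class contributes less the more often
        # it is already covered, so rare classes dominate the score.
        return sum(1.0 / (1 + coverage[c]) for c in mask_classes[i])

    while remaining and len(selected) < budget:
        # Iterate in index order so ties go to the lowest index.
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)
        coverage.update(mask_classes[best])
    return selected


# Toy example: class 3 appears in a single mask only.
toy_masks = [{0, 1}, {0, 1}, {0, 2}, {3}]
print(greedy_class_balanced_selection(toy_masks, budget=3))
```

In the toy example the second pick skips mask 1, which only repeats classes already covered, in favor of mask 2, which adds a new class. In the pipeline the abstract describes, the selected masks would then serve as conditioning inputs for the layout-to-image diffusion model.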

Related papers