Fugu-MT Paper Translation (Summary): AtteConDA: Attention-Based Conflict Suppression in Multi-Condition Diffusion Models and Synthetic Data Augmentation

Paper summary: AtteConDA: Attention-Based Conflict Suppression in Multi-Condition Diffusion Models and Synthetic Data Augmentation

arxiv url: http://arxiv.org/abs/2605.09425v1
Date: Sun, 10 May 2026 08:56:08 GMT
Status: Translation complete
Last updated in system: 2026-05-12 23:28:50.244559
Title: AtteConDA: Attention-Based Conflict Suppression in Multi-Condition Diffusion Models and Synthetic Data Augmentation
Authors: Shogo Noguchi
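Abstract summary: This work contributes to image generation research by addressing condition conflicts in multi-condition generation. It offers an important step toward mitigating data scarcity in high-level autonomous-driving tasks.
Reference score (independently computed attention metric): 0.0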
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent conditional image generation methods can improve controllability by generating images that are faithful to conditions such as sketches, human poses, segmentation maps, and depth. By applying these techniques to image augmentation while preserving annotations, generated images can be used as additional training data and can improve recognition performance. However, for high-level driving tasks such as traffic-rule extraction and driving-behavior understanding, simply using annotations as conditions is insufficient. Instead, images must be augmented while preserving the detailed high-level structure of the original scene. One possible solution is to use multiple conditions so that generated images retain diverse structural cues after generation. However, when multiple conditions are used, conflicts among conditions can prevent reliable structure preservation. In this work, we input semantic segmentation, depth, and edges extracted from the original image into a multi-condition image generation model, thereby providing rich structural information as conditions. We further propose a modeling approach for handling conflicts among multiple conditions and show that it enables image generation with stronger structural preservation. We also build a generation framework and evaluation protocol for driving tasks, establishing a basis for comparison with prior and future models. As a result, this work contributes to image generation research by addressing condition conflicts in multi-condition generation and provides an important step toward mitigating data scarcity in high-level autonomous-driving tasks.

Related papers