Fugu-MT Paper Translation (Abstract): RSGen: Enhancing Layout-Driven Remote Sensing Image Generation with Diverse Edge Guidance

Paper overview: RSGen: Enhancing Layout-Driven Remote Sensing Image Generation with Diverse Edge Guidance

arxiv url: http://arxiv.org/abs/2603.15484v2
Date: Tue, 17 Mar 2026 07:32:14 GMT
Status: Translation complete
System update date: 2026-03-18 13:19:44.048919
Title: RSGen: Enhancing Layout-Driven Remote Sensing Image Generation with Diverse Edge Guidance
Title (reference translation): RSGen: Enhancing Layout-Driven Remote Sensing Image Generation with Diverse Edge Guidance
Authors: Xianbao Hou, Yonghao He, Zeyd Boukhers, John See, Hu Su, Wei Sui, Cong Yang
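Abstract summary: Diffusion models have significantly mitigated the impact of annotated data scarcity in remote sensing. Recent approaches use these models to achieve diverse and controllable layout-to-image synthesis. This paper proposes RSGen, a plug-and-play framework that leverages diverse edge guidance to enhance layout-driven RS image generation.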
Reference score (independently computed attention score): 15.916510585915406
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diffusion models have significantly mitigated the impact of annotated data scarcity in remote sensing (RS). Although recent approaches have successfully harnessed these models to enable diverse and controllable Layout-to-Image (L2I) synthesis, they still suffer from limited fine-grained control and fail to strictly adhere to bounding box constraints. To address these limitations, we propose RSGen, a plug-and-play framework that leverages diverse edge guidance to enhance layout-driven RS image generation. Specifically, RSGen employs a progressive enhancement strategy: 1) it first enriches the diversity of edge maps composited from retrieved training instances via Image-to-Image generation; and 2) subsequently utilizes these diverse edge maps as conditioning for existing L2I models to enforce pixel-level control within bounding boxes, ensuring the generated instances strictly adhere to the layout. Extensive experiments across three baseline models demonstrate that RSGen significantly boosts the capabilities of existing L2I models. For instance, with CC-Diff on the DOTA dataset for oriented object detection, we achieve remarkable gains of +9.8/+12.0 in YOLOScore mAP50/mAP50-95 and +1.6 in mAP on the downstream detection task. Our code will be publicly available: https://github.com/D-Robotics-AI-Lab/RSGen
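Abstract (reference translation): Diffusion models have significantly mitigated the impact of annotated data scarcity in remote sensing (RS). Recent approaches have successfully leveraged these models to enable diverse and controllable Layout-to-Image (L2I) synthesis, yet they still suffer from limited fine-grained control and do not adhere strictly to bounding box constraints. To address these limitations, we propose RSGen, a plug-and-play framework that uses diverse edge guidance to improve layout-driven RS image generation. Specifically, RSGen adopts a progressive enhancement strategy: 1) it first increases the diversity of edge maps composited from retrieved training instances via Image-to-Image generation; 2) it then uses these diverse edge maps as conditioning for existing L2I models to enforce pixel-level control within bounding boxes, so that generated instances adhere strictly to the layout. Extensive experiments across three baseline models show that RSGen substantially improves the capabilities of existing L2I models. For example, with CC-Diff on the DOTA dataset for oriented object detection, it achieves gains of +9.8/+12.0 in YOLOScore mAP50/mAP50-95 and +1.6 in mAP on the downstream detection task. Code will be made publicly available: https://github.com/D-Robotics-AI-Lab/RSGen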
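The abstract mentions edge maps "composited from retrieved training instances" and used for pixel-level control inside bounding boxes. The following is only a minimal sketch of what such compositing could look like, not the authors' implementation. The function names `resize_nearest` and `composite_edge_map` are hypothetical, the boxes are assumed to be axis-aligned `(x0, y0, x1, y1)` tuples (DOTA itself uses oriented boxes), the edge patches are assumed to be binary 2-D lists, and nearest-neighbour resizing is an arbitrary choice made for brevity.

```python
def resize_nearest(patch, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list to out_h x out_w (illustrative helper)."""
    in_h, in_w = len(patch), len(patch[0])
    return [
        [patch[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def composite_edge_map(height, width, boxes, patches):
    """Paste one edge patch into each box of an empty canvas.

    Assumptions (not from the paper): boxes are axis-aligned
    (x0, y0, x1, y1) with exclusive upper bounds, and overlapping
    edges are merged with a per-pixel maximum.
    """
    canvas = [[0] * width for _ in range(height)]
    for (x0, y0, x1, y1), patch in zip(boxes, patches):
        box_h, box_w = y1 - y0, x1 - x0
        if box_h <= 0 or box_w <= 0:
            continue  # skip degenerate boxes
        resized = resize_nearest(patch, box_h, box_w)
        for dy in range(box_h):
            for dx in range(box_w):
                y, x = y0 + dy, x0 + dx
                # Ignore pixels that fall outside the canvas.
                if 0 <= y < height and 0 <= x < width:
                    canvas[y][x] = max(canvas[y][x], resized[dy][dx])
    return canvas
```

In this sketch every edge pixel stays inside its box by construction. That is the property the abstract describes as strict adherence to the layout. How RSGen actually retrieves instances, diversifies the edge maps, and conditions the L2I model is not specified on this page.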

Related papers