Fugu-MT Paper Translation (Abstract): Composing Driving Worlds through Disentangled Control for Adversarial Scenario Generation

Paper abstract: Composing Driving Worlds through Disentangled Control for Adversarial Scenario Generation

arxiv url: http://arxiv.org/abs/2603.12864v1
Date: Fri, 13 Mar 2026 10:10:21 GMT
Status: Translation complete
Last updated in system: 2026-03-16 17:38:12.042273
Title: Composing Driving Worlds through Disentangled Control for Adversarial Scenario Generation
Title (reference translation): Composing Driving Worlds via Disentangled Control for Adversarial Scenario Generation
Authors: Yifan Zhan, Zhengqing Chen, Qingjie Wang, Zhuo He, Muyao Niu, Xiaoyang Guo, Wei Yin, Weiqiang Ren, Qian Zhang, Yinqiang Zheng
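Abstract summary: A major challenge in autonomous driving is the "long tail" of safety-critical edge cases. We introduce CompoSIA, a compositional driving video simulator that disentangles traffic factors. We demonstrate superior controllable generation quality over state-of-the-art baselines.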
Reference score (independently computed attention score): 40.89741209403581
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A major challenge in autonomous driving is the "long tail" of safety-critical edge cases, which often emerge from unusual combinations of common traffic elements. Synthesizing these scenarios is crucial, yet current controllable generative models provide incomplete or entangled guidance, preventing the independent manipulation of scene structure, object identity, and ego actions. We introduce CompoSIA, a compositional driving video simulator that disentangles these traffic factors, enabling fine-grained control over diverse adversarial driving scenarios. To support controllable identity replacement of scene elements, we propose noise-level identity injection, which allows pose-agnostic identity generation across diverse element poses, all from a single reference image. Furthermore, a hierarchical dual-branch action control mechanism is introduced to improve action controllability. Such disentangled control enables adversarial scenario synthesis: systematically combining safe elements into dangerous configurations that entangled generators cannot produce. Extensive comparisons demonstrate superior controllable generation quality over state-of-the-art baselines, with a 17% improvement in FVD for identity editing and reductions of 30% and 47% in rotation and translation errors for action control. Furthermore, downstream stress-testing reveals substantial planner failures: across editing modalities, the average 3s collision rate increases by 173%.
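Abstract (reference translation): A major challenge in autonomous driving is the "long tail" of safety-critical edge cases, which often arise from unusual combinations of common traffic elements. Synthesizing these scenarios is important, but current controllable generative models provide incomplete or entangled guidance, which prevents independent manipulation of scene structure, object identity, and ego actions. CompoSIA is a compositional driving video simulator that disentangles these traffic factors and enables fine-grained control over diverse adversarial driving scenarios. To support controllable identity replacement of scene elements, we propose noise-level identity injection, which enables pose-agnostic identity generation across varied element poses from a single reference image. In addition, a hierarchical dual-branch action control mechanism is introduced to improve action controllability. Such disentangled control makes it possible to synthesize adversarial scenarios that entangled generators cannot produce, by systematically combining safe elements into dangerous configurations. Extensive comparisons show better controllable generation quality than state-of-the-art baselines, with a 17% improvement in FVD for identity editing and reductions of 30% and 47% in rotation and translation errors for action control. Furthermore, downstream stress-testing reveals substantial planner failures: across editing modalities, the average 3s collision rate increases by 173%.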

List of related papers