Fugu-MT Paper Translation (Abstract): RoboLayout: Differentiable 3D Scene Generation for Embodied Agents

Paper overview: RoboLayout: Differentiable 3D Scene Generation for Embodied Agents

arxiv url: http://arxiv.org/abs/2603.05522v2
Date: Mon, 09 Mar 2026 14:05:21 GMT
Status: Translation complete
Last updated in system: 2026-03-15 16:38:22.376731
Title: RoboLayout: Differentiable 3D Scene Generation for Embodied Agents
Authors: Ali Shamsaddinlou
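Abstract summary: RoboLayout is introduced as an extension of LayoutVLM that augments the original framework with agent-aware reasoning and improved optimization stability. It integrates explicit reachability constraints into a differentiable layout optimization process, enabling the generation of layouts that are navigable and actionable by embodied agents. Overall, RoboLayout preserves strong semantic alignment and physical plausibility while enhancing applicability to agent-centric indoor scenes.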
Reference score (site-computed attention score): 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Recent advances in vision language models (VLMs) have shown strong potential for spatial reasoning and 3D scene layout generation from open-ended language instructions. However, generating layouts that are not only semantically coherent but also feasible for interaction by embodied agents remains challenging, particularly in physically constrained indoor environments. In this paper, RoboLayout is introduced as an extension of LayoutVLM that augments the original framework with agent-aware reasoning and improved optimization stability. RoboLayout integrates explicit reachability constraints into a differentiable layout optimization process, enabling the generation of layouts that are navigable and actionable by embodied agents. Importantly, the agent abstraction is not limited to a specific robot platform and can represent diverse entities with distinct physical capabilities, such as service robots, warehouse robots, humans of different age groups, or animals, allowing environment design to be tailored to the intended agent. In addition, a local refinement stage is proposed that selectively reoptimizes problematic object placements while keeping the remainder of the scene fixed, improving convergence efficiency without increasing global optimization iterations. Overall, RoboLayout preserves the strong semantic alignment and physical plausibility of LayoutVLM while enhancing applicability to agent-centric indoor scene generation, as demonstrated by experimental results across diverse scene configurations.
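Illustrative sketch (not from the paper): the abstract does not say how the reachability constraints or the local refinement stage are formulated. The Python below is a minimal toy under stated assumptions: objects are circular footprints on a 2D floor, "reachability" is reduced to a pairwise clearance of at least the agent's width, the penalty is a squared hinge minimized by plain gradient descent, and the refinement step updates only the objects flagged as problematic while all others stay fixed. Every name here (clearance_penalty, penalty_gradient, violating_objects, refine, agent_width) is hypothetical.

```python
import math

def clearance_penalty(positions, radii, agent_width):
    """Sum of squared hinge violations over all object pairs.
    A pair violates when the gap between footprints is below agent_width."""
    total = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            dist = math.hypot(positions[i][0] - positions[j][0],
                              positions[i][1] - positions[j][1])
            violation = max(0.0, agent_width - (dist - radii[i] - radii[j]))
            total += violation ** 2
    return total

def penalty_gradient(positions, radii, agent_width):
    """Analytic gradient of clearance_penalty w.r.t. each object position."""
    grads = [[0.0, 0.0] for _ in positions]
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            dist = max(math.hypot(dx, dy), 1e-9)  # avoid division by zero
            violation = max(0.0, agent_width - (dist - radii[i] - radii[j]))
            if violation > 0.0:
                coef = -2.0 * violation / dist  # d(violation^2)/d(p_i) = coef * (dx, dy)
                grads[i][0] += coef * dx
                grads[i][1] += coef * dy
                grads[j][0] -= coef * dx
                grads[j][1] -= coef * dy
    return grads

def violating_objects(positions, radii, agent_width, tol=1e-6):
    """Indices of objects that take part in at least one violated pair."""
    flagged = set()
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            dist = math.hypot(positions[i][0] - positions[j][0],
                              positions[i][1] - positions[j][1])
            if agent_width - (dist - radii[i] - radii[j]) > tol:
                flagged.update((i, j))
    return flagged

def refine(positions, radii, agent_width, movable, lr=0.1, steps=200):
    """Gradient descent that moves only the objects in `movable`."""
    pos = [list(p) for p in positions]
    for _ in range(steps):
        grads = penalty_gradient(pos, radii, agent_width)
        for k in movable:
            pos[k][0] -= lr * grads[k][0]
            pos[k][1] -= lr * grads[k][1]
    return pos

# Toy scene: objects 0 and 1 are too close for a 0.6 m wide agent; object 2 is fine.
layout = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]
radii = [0.4, 0.4, 0.4]
agent_width = 0.6

flagged = violating_objects(layout, radii, agent_width)
refined = refine(layout, radii, agent_width, movable=flagged)
print(sorted(flagged), clearance_penalty(refined, radii, agent_width))
```

Changing agent_width is the only knob that stands in for the agent abstraction here: a wider agent demands larger gaps. A real system would need path-level reachability and the semantic objectives the abstract mentions, none of which this toy models.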

