Fugu-MT Paper Translation (Abstract): LL3M: Large Language 3D Modelers

Paper overview: LL3M: Large Language 3D Modelers

arxiv url: http://arxiv.org/abs/2508.08228v1
Date: Mon, 11 Aug 2025 17:48:02 GMT
Status: Translation complete
Last updated in system: 2025-08-12 21:23:29.246956
Title: LL3M: Large Language 3D Modelers
Title (reference translation): LL3M: Large Language 3D Modeling
Authors: Sining Lu, Guan Chen, Nam Anh Dinh, Itai Lang, Ari Holtzman, Rana Hanocka
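Abstract summary: LL3M is a system that generates 3D assets by writing interpretable Python code in Blender. It reformulates shape generation as a code-writing task, enabling modularity, editability, and integration with artist workflows. The experiments show the usefulness of code as a generative and interpretable medium for 3D asset creation.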
Reference score (independently computed attention score): 18.23329430829059
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present LL3M, a multi-agent system that leverages pretrained large language models (LLMs) to generate 3D assets by writing interpretable Python code in Blender. We break away from the typical generative approach that learns from a collection of 3D data. Instead, we reformulate shape generation as a code-writing task, enabling greater modularity, editability, and integration with artist workflows. Given a text prompt, LL3M coordinates a team of specialized LLM agents to plan, retrieve, write, debug, and refine Blender scripts that generate and edit geometry and appearance. The generated code works as a high-level, interpretable, human-readable, well-documented representation of scenes and objects, making full use of sophisticated Blender constructs (e.g. B-meshes, geometry modifiers, shader nodes) for diverse, unconstrained shapes, materials, and scenes. This code presents many avenues for further agent and human editing and experimentation via code tweaks or procedural parameters. This medium naturally enables a co-creative loop in our system: agents can automatically self-critique using code and visuals, while iterative user instructions provide an intuitive way to refine assets. A shared code context across agents enables awareness of previous attempts, and a retrieval-augmented generation knowledge base built from Blender API documentation, BlenderRAG, equips agents with examples, types, and functions empowering advanced modeling operations and code correctness. We demonstrate the effectiveness of LL3M across diverse shape categories, style and material edits, and user-driven refinements. Our experiments showcase the power of code as a generative and interpretable medium for 3D asset creation. Our project page is at https://threedle.github.io/ll3m.
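Abstract (reference translation): We propose LL3M, a multi-agent system that uses pretrained large language models (LLMs) to generate 3D assets by writing interpretable Python code in Blender. We depart from the typical generative approach that learns from collections of 3D data. Instead, we reformulate shape generation as a code-writing task, enabling modularity, editability, and integration with artist workflows. Given a text prompt, LL3M coordinates a team of specialized LLM agents to plan, retrieve, write, debug, and refine Blender scripts that generate and edit geometry and appearance. The generated code serves as a high-level, interpretable, human-readable, documented representation of scenes and objects, making full use of sophisticated Blender constructs (e.g., B-meshes, geometry modifiers, shader nodes) for diverse, unconstrained shapes, materials, and scenes. This code opens many avenues for further editing by agents and humans, and for experimentation through code tweaks or procedural parameters. Agents can automatically self-critique using code and visuals, while iterative user instructions offer an intuitive way to refine assets. A shared code context across agents makes them aware of previous attempts, and BlenderRAG, a retrieval-augmented generation knowledge base built from the Blender API documentation, gives agents examples, types, and functions that support advanced modeling operations and code correctness. We show that LL3M is effective across diverse shape categories, style and material edits, and user-driven refinements. Our experiments show the usefulness of code as a generative and interpretable medium for 3D asset creation. Our project page is at https://threedle.github.io/ll3m.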
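
Illustrative sketch (not from the paper): the abstract describes agents that plan, retrieve, write, debug, and refine Blender scripts over a shared code context. The Python below is a minimal sketch of that control flow under loud assumptions. Every name in it (`SharedContext`, `KNOWLEDGE_BASE`, `plan`, `retrieve`, `write`, `debug`, `refine`, `generate`) is hypothetical, the agents are hard-coded stubs rather than LLM calls, and `KNOWLEDGE_BASE` is only a two-entry stand-in for BlenderRAG. The generated script is checked for syntax with `compile` and is never executed, so Blender is not required to run the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class SharedContext:
    """Hypothetical code context that every stub agent can read and write."""
    prompt: str
    plan: list = field(default_factory=list)
    docs: list = field(default_factory=list)
    attempts: list = field(default_factory=list)  # every script draft so far


# Stand-in for BlenderRAG: keyword -> Blender API snippet (assumed contents).
KNOWLEDGE_BASE = {
    "cube": "bpy.ops.mesh.primitive_cube_add(size=SEAT_SIZE, location=(0, 0, 1))",
    "cylinder": "bpy.ops.mesh.primitive_cylinder_add("
                "radius=LEG_RADIUS, depth=1.0, location=(0, 0, 0.5))",
}


def plan(ctx):
    # A real planner would query an LLM with ctx.prompt; this plan is hard-coded.
    ctx.plan = ["cube seat", "cylinder leg"]


def retrieve(ctx):
    # Keep the snippets whose keyword appears in some plan step.
    ctx.docs = [snippet for key, snippet in KNOWLEDGE_BASE.items()
                if any(key in step for step in ctx.plan)]


def write(ctx):
    header = ["import bpy",
              "SEAT_SIZE = 1.0  # procedural parameter",
              "LEG_RADIUS = 0.05  # procedural parameter"]
    body = list(ctx.docs)
    if not ctx.attempts:
        # First draft deliberately drops a closing parenthesis so that the
        # debug step has something to catch; later drafts see the earlier
        # attempt in the shared context and skip this.
        body[-1] = body[-1][:-1]
    script = "\n".join(header + body)
    ctx.attempts.append(script)
    return script


def debug(script):
    # Syntax check only: the script is compiled but never run.
    try:
        compile(script, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


def refine(script, name, value):
    # Apply a user edit by rewriting one procedural parameter line.
    out = []
    for line in script.splitlines():
        if line.startswith(f"{name} = "):
            line = f"{name} = {value}"
        out.append(line)
    return "\n".join(out)


def generate(prompt, max_attempts=3):
    ctx = SharedContext(prompt)
    plan(ctx)
    retrieve(ctx)
    for _ in range(max_attempts):
        script = write(ctx)
        if debug(script):
            return script, ctx
    raise RuntimeError("no syntactically valid script produced")


script, ctx = generate("a small stool")
bigger = refine(script, "SEAT_SIZE", 1.5)
```

In this sketch the first draft fails the syntax check and the second passes, so `ctx.attempts` ends up holding both versions. The `refine` call stands in for an iterative user instruction that changes one procedural parameter without regenerating the rest of the script.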

Related papers