Fugu-MT paper translation (abstract): 3D-CoS: A New 3D Reconstruction Paradigm Based on VLM Code Synthesis

Paper overview: 3D-CoS: A New 3D Reconstruction Paradigm Based on VLM Code Synthesis

arxiv url: http://arxiv.org/abs/2606.10478v1
Date: Tue, 09 Jun 2026 06:46:29 GMT
Status: Translation complete
Last updated in system: 2026-06-10 15:40:58.357052
Title: 3D-CoS: A New 3D Reconstruction Paradigm Based on VLM Code Synthesis
Title (reference translation): 3D-CoS: A New 3D Reconstruction Paradigm Based on VLM Code Synthesis
Authors: Yuhao Wang, Puyi Wang, Linjie Li, Zhengyuan Yang, Kevin Qinghong Lin, Yu Cheng
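Abstract summary: This paper proposes and systematically evaluates a new 3D reconstruction paradigm in which 3D assets are constructed as executable code. The study shows that code as a 3D representation offers strong controllability and locality, yielding higher edit fidelity and better preservation of unedited regions.
Reference score (independently calculated attention score): 69.23609431485401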
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Most recent 3D reconstruction and editing systems operate on implicit and explicit representations such as NeRF, point clouds, or meshes. While these representations enable high-fidelity rendering, they are fundamentally low-level and hard to control programmatically. In contrast, we propose and systematically evaluate a new 3D reconstruction paradigm, 3D Code Synthesis (3D-CoS), where 3D assets are constructed as executable Blender code, a programmatic and interpretable medium. To assess how well current VLMs can use code to represent 3D objects, we evaluate representative open-source and closed-source VLMs in code-based reconstruction under a unified protocol. We further introduce a suite of structured code-synthesis workflows, including blueprint-based planning, Retrieval-Augmented Generation (RAG) over Blender API documentation, few-shot geometric demonstrations, and a component-level Agent workflow for part-wise code generation. To demonstrate the unique advantages of this representation, we further evaluate localized text-driven modifications and compare our code-based edits with a point-cloud-based 3D editing baseline. Our study shows that code as a 3D representation offers strong controllability and locality, yielding stronger edit fidelity and better preservation of unedited regions in our targeted editing evaluation. Our work also analyzes the potential of this paradigm, delineates the current capability frontier of VLMs for programmatic 3D modeling, and highlights code synthesis as a promising direction for editable 3D reconstruction.
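Abstract (reference translation): Recent 3D reconstruction and editing systems operate on implicit and explicit representations such as NeRF, point clouds, and meshes. These representations enable high-fidelity rendering, but they are fundamentally low-level and hard to control programmatically. In contrast, the authors propose and systematically evaluate a new 3D reconstruction paradigm, 3D Code Synthesis (3D-CoS), which constructs 3D assets as Blender code, a programmatic and interpretable medium. To assess how well current VLMs can use code to represent 3D objects, they evaluate representative open-source and closed-source VLMs on code-based reconstruction under a unified protocol. They also introduce a set of structured code-synthesis workflows: blueprint-based planning, Retrieval-Augmented Generation (RAG) over the Blender API documentation, few-shot geometric demonstrations, and a component-level Agent workflow for part-wise code generation. To demonstrate the distinctive advantages of this representation, they further evaluate localized text-driven modifications and compare code-based edits against a point-cloud-based 3D editing baseline. The study shows that code as a 3D representation offers strong controllability and locality, yielding higher edit fidelity and better preservation of unedited regions. The work also analyzes the potential of the paradigm, outlines the current capability frontier of VLMs for programmatic 3D modeling, and highlights code synthesis as a promising direction for editable 3D reconstruction.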
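The abstract's claim that code gives edits "locality" can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `Part` structure, the `emit_script` and `edit_part` helpers, and the toy table are hypothetical. The emitted lines use the names of Blender's mesh primitive operators, but the script is only generated as text and never executed, so no Blender installation is needed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Part:
    """One named component of a toy object (hypothetical, not from the paper)."""
    name: str
    primitive: str   # "cube" or "cylinder"
    params: tuple    # ordered (keyword, value) pairs for the operator call


def emit_line(part):
    """Render one part as one line of Blender Python, tagged with the part name."""
    args = ", ".join(f"{key}={value!r}" for key, value in part.params)
    return f"bpy.ops.mesh.primitive_{part.primitive}_add({args})  # {part.name}"


def emit_script(parts):
    """Render the whole object as a list of script lines."""
    return ["import bpy"] + [emit_line(p) for p in parts]


def edit_part(parts, name, **changes):
    """Return a new part list in which only the named part has new parameters."""
    edited = []
    for p in parts:
        if p.name == name:
            merged = dict(p.params)   # keeps the original keyword order
            merged.update(changes)
            p = replace(p, params=tuple(merged.items()))
        edited.append(p)
    return edited


# A toy "table": a flattened cube for the top and a single cylinder leg.
table = [
    Part("top", "cube", (("size", 1.0), ("location", (0.0, 0.0, 1.0)),
                         ("scale", (1.0, 0.6, 0.05)))),
    Part("leg", "cylinder", (("radius", 0.05), ("depth", 1.0),
                             ("location", (0.0, 0.0, 0.5)))),
]

before = emit_script(table)
after = emit_script(edit_part(table, "leg", radius=0.1))

# Indices of script lines that differ after the edit "make the leg thicker".
changed = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
print(changed)
```

In this toy setting the edit touches exactly one line of the generated script and every other part is left byte-for-byte identical, which is the intuition behind preserving unedited regions. A point cloud or mesh has no comparable named handle for "the leg".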

Related papers