Fugu-MT Paper Translation (Abstract): P3D-Bench: Benchmarking MLLMs for Parametric 3D Generation and Structural Reasoning

Paper overview: P3D-Bench: Benchmarking MLLMs for Parametric 3D Generation and Structural Reasoning

arxiv url: http://arxiv.org/abs/2606.11152v2
Date: Wed, 10 Jun 2026 16:12:45 GMT
Status: Translation complete
System update date: 2026-06-11 14:23:44.412107
Title: P3D-Bench: Benchmarking MLLMs for Parametric 3D Generation and Structural Reasoning
Authors: Yikang Yang, Zhanpeng Hu, Youtian Lin, Mengqi Zhou, Jingxi Xu, Feihu Zhang, Jiaheng Liu, Yao Yao
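Abstract summary: We introduce P3D-Bench, a benchmark for parametric 3D generation. Under a unified protocol, P3D-Bench covers three task families (Text-to-3D, Image-to-3D, and Assembly-3D), and frontier MLLMs and text-only LLMs are evaluated on 400 text cases, 400 image cases, and 203 annotated assemblies.
Reference score (independently computed attention metric): 37.03209423305997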
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multimodal large language models can write code to produce complex programs as well as use programs to do 3D modeling, which opens up a new avenue for 3D generation powered by their priors, world knowledge and reasoning. Yet existing benchmarks rarely evaluate 3D modeling through code. Such modeling demands more than runnable code: from a text or visual specification, a model must generate a parametric 3D program that is geometrically precise, semantically aligned and assembly-consistent. We introduce P3D-Bench, a benchmark for parametric 3D generation. Unlike a 3D mesh, a parametric 3D program exposes explicit dimensions, construction operations and part relations, revealing whether a model recovers a design's structure, not just its appearance. Under a unified protocol, P3D-Bench covers three task families (Text-to-3D, Image-to-3D and Assembly-3D) and scores each output for executability, geometric fidelity, topology, text-grounded constraints, multiview semantic alignment and part-level structure. We evaluate frontier MLLMs and text-only LLMs on 400 text cases, 400 image cases and 203 annotated assemblies, with domain-specific models as reference points. Our extensive evaluation yields three findings. First, assemblies are the hardest setting, where models still fail to compose multiple parts into a coherent structure. Second, models can often recover the global shape and semantic identity of the target object, yet fail to reproduce the precise parametric geometry specified by the input. Third, part-level modeling remains weak on assemblies, where models recover neither the geometry of each part nor the right number of parts. These results position P3D-Bench as a benchmark for evaluating precise parametric geometry and part-level structure in parametric 3D generation.
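To illustrate the distinction the abstract draws between a mesh and a parametric 3D program, here is a minimal Python sketch in which a design is stored as explicit dimensions, construction operations, and named parts. Every name in it (`Box`, `Part`, `part_count_matches`, `relative_volume_error`) is hypothetical and chosen only for illustration; the abstract does not describe P3D-Bench's actual program format or metric definitions, so these two checks are stand-ins for the general idea of part-level and geometric scoring, not the benchmark's method.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Box:
    """Base solid with explicit dimensions (hypothetical representation)."""
    width: float
    depth: float
    height: float


@dataclass
class Part:
    """A named part: a base box plus through-holes cut along its height.

    Assumption: holes lie fully inside the box and do not overlap,
    so their volumes can simply be subtracted.
    """
    name: str
    base: Box
    hole_diameters: List[float] = field(default_factory=list)

    def volume(self) -> float:
        solid = self.base.width * self.base.depth * self.base.height
        removed = sum(
            math.pi * (d / 2) ** 2 * self.base.height
            for d in self.hole_diameters
        )
        return solid - removed


def part_count_matches(predicted: List[Part], reference: List[Part]) -> bool:
    """Illustrative part-level check: same number of parts."""
    return len(predicted) == len(reference)


def relative_volume_error(predicted: Part, reference: Part) -> float:
    """Illustrative geometric check: relative volume difference of one part."""
    return abs(predicted.volume() - reference.volume()) / reference.volume()


# A two-part reference assembly and a prediction that misses one part
# and omits the hole in the plate.
reference = [Part("plate", Box(40, 20, 5), [4.0]), Part("post", Box(5, 5, 30))]
predicted = [Part("plate", Box(40, 20, 5))]

print(part_count_matches(predicted, reference))
print(relative_volume_error(predicted[0], reference[0]))
```

Because the dimensions and operations are explicit, a prediction that looks similar overall (the plate has the right outline) can still be flagged for a missing hole or a missing part, which is the kind of failure the abstract reports.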

