Human-in-the-Loop: Quantitative Evaluation of 3D Models Generation by Large Language Models
- URL: http://arxiv.org/abs/2509.07010v1
- Date: Sat, 06 Sep 2025 11:04:15 GMT
- Title: Human-in-the-Loop: Quantitative Evaluation of 3D Models Generation by Large Language Models
- Authors: Ahmed R. Sadik, Mariusz Bujny,
- Abstract summary: This paper introduces a human in the loop framework for the quantitative evaluation of Large Language Models generated 3D models.<n>We propose a comprehensive suite of similarity and complexity metrics, including volumetric accuracy, surface alignment, dimensional fidelity, and topological intricacy.<n>Our findings demonstrate improved generation fidelity with increased semantic richness, with code level prompts achieving perfect reconstruction.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models are increasingly capable of interpreting multimodal inputs to generate complex 3D shapes, yet robust methods to evaluate geometric and structural fidelity remain underdeveloped. This paper introduces a human in the loop framework for the quantitative evaluation of LLM generated 3D models, supporting applications such as democratization of CAD design, reverse engineering of legacy designs, and rapid prototyping. We propose a comprehensive suite of similarity and complexity metrics, including volumetric accuracy, surface alignment, dimensional fidelity, and topological intricacy, to benchmark generated models against ground truth CAD references. Using an L bracket component as a case study, we systematically compare LLM performance across four input modalities: 2D orthographic views, isometric sketches, geometric structure trees, and code based correction prompts. Our findings demonstrate improved generation fidelity with increased semantic richness, with code level prompts achieving perfect reconstruction across all metrics. A key contribution of this work is demonstrating that our proposed quantitative evaluation approach enables significantly faster convergence toward the ground truth, especially compared to traditional qualitative methods based solely on visual inspection and human intuition. This work not only advances the understanding of AI assisted shape synthesis but also provides a scalable methodology to validate and refine generative models for diverse CAD applications.
Related papers
- ArtLLM: Generating Articulated Assets via 3D LLM [19.814132638278547]
ArtLLM is a novel framework for generating high-quality articulated assets directly from complete 3D meshes.<n>At its core is a 3D multimodal large language model trained on a large-scale articulation dataset.<n> Experiments show that ArtLLM significantly outperforms state-of-the-art methods in both part layout accuracy and joint prediction.
arXiv Detail & Related papers (2026-03-01T15:07:46Z) - MoRE: 3D Visual Geometry Reconstruction Meets Mixture-of-Experts [50.37005070020306]
MoRE is a dense 3D visual foundation model based on a Mixture-of-Experts (MoE) architecture.<n>MoRE incorporates a confidence-based depth refinement module that stabilizes and refines geometric estimation.<n>It integrates dense semantic features with globally aligned 3D backbone representations for high-fidelity surface normal prediction.
arXiv Detail & Related papers (2025-10-31T06:54:27Z) - EvoCAD: Evolutionary CAD Code Generation with Vision Language Models [1.9233158329692603]
EvoCAD is a method for generating computer-aided design (CAD) objects through their symbolic representations.<n>We introduce two new metrics based on topological properties defined by the Euler characteristic, which capture a form of semantic similarity between 3D objects.
arXiv Detail & Related papers (2025-10-13T17:12:02Z) - Hierarchical Neural Semantic Representation for 3D Semantic Correspondence [72.8101601086805]
We design the hierarchical neural semantic representation (HNSR), which consists of a global semantic feature to capture high-level structure and multi-resolution local geometric features.<n>Second, we design a progressive global-to-local matching strategy, which establishes coarse semantic correspondence using the global semantic feature.<n>Third, our framework is training-free and broadly compatible with various pre-trained 3D generative backbones, demonstrating strong generalization across diverse shape categories.
arXiv Detail & Related papers (2025-09-22T07:23:07Z) - GIQ: Benchmarking 3D Geometric Reasoning of Vision Foundation Models with Simulated and Real Polyhedra [33.53387523266523]
We introduce GIQ, a benchmark specifically designed to evaluate the geometric reasoning capabilities of vision and vision-language foundation models.<n> GIQ comprises synthetic and real-world images of 224 diverse polyhedra.
arXiv Detail & Related papers (2025-06-09T20:11:21Z) - cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning [55.16668009268005]
We propose a multi-modal CAD reconstruction model that simultaneously processes all three input modalities.<n>Inspired by large language model (LLM) training paradigms, we adopt a two-stage pipeline: supervised fine-tuning (SFT) on large-scale procedurally generated data, followed by reinforcement learning (RL) fine-tuning using online feedback, obtained programatically.<n>In the DeepCAD benchmark, our SFT model outperforms existing single-modal approaches in all three input modalities simultaneously.
arXiv Detail & Related papers (2025-05-28T22:32:31Z) - PRISM: Probabilistic Representation for Integrated Shape Modeling and Generation [79.46526296655776]
PRISM is a novel approach for 3D shape generation that integrates categorical diffusion models with Statistical Shape Models (SSM) and Gaussian Mixture Models (GMM)<n>Our method employs compositional SSMs to capture part-level geometric variations and uses GMM to represent part semantics in a continuous space.<n>Our approach significantly outperforms previous methods in both quality and controllability of part-level operations.
arXiv Detail & Related papers (2025-04-06T11:48:08Z) - CADDreamer: CAD Object Generation from Single-view Images [43.59340035126575]
Existing 3D generative models often produce overly dense and unstructured meshes.<n>We introduce CADDreamer, a novel approach for generating boundary representations (B-rep) of CAD objects from a single image.<n>Results demonstrate that our method effectively recovers high-quality CAD objects from single-view images.
arXiv Detail & Related papers (2025-02-28T05:30:29Z) - JADE: Joint-aware Latent Diffusion for 3D Human Generative Modeling [62.77347895550087]
We introduce JADE, a generative framework that learns the variations of human shapes with fined-grained control.<n>Our key insight is a joint-aware latent representation that decomposes human bodies into skeleton structures.<n>To generate coherent and plausible human shapes under our proposed decomposition, we also present a cascaded pipeline.
arXiv Detail & Related papers (2024-12-29T14:18:35Z) - MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction [4.457326808146675]
This paper investigates the research task of reconstructing the 3D clothed body from a monocular image.<n>Existing approaches leverage pre-trained SMPL(-X) estimation models or generative models to provide auxiliary information for human reconstruction.<n>We propose a multi-level geometry learning framework. Technically, we design three key components: skeleton-level enhancement, joint-level augmentation, and wrinkle-level refinement modules.
arXiv Detail & Related papers (2024-12-04T08:06:06Z) - Human as Points: Explicit Point-based 3D Human Reconstruction from Single-view RGB Images [71.91424164693422]
We introduce an explicit point-based human reconstruction framework called HaP.<n>Our approach is featured by fully-explicit point cloud estimation, manipulation, generation, and refinement in the 3D geometric space.<n>Our results may indicate a paradigm rollback to the fully-explicit and geometry-centric algorithm design.
arXiv Detail & Related papers (2023-11-06T05:52:29Z) - Geometric Deep Learning for Structure-Based Drug Design: A Survey [83.87489798671155]
Structure-based drug design (SBDD) leverages the three-dimensional geometry of proteins to identify potential drug candidates.
Recent advancements in geometric deep learning, which effectively integrate and process 3D geometric data, have significantly propelled the field forward.
arXiv Detail & Related papers (2023-06-20T14:21:58Z) - Translational Symmetry-Aware Facade Parsing for 3D Building
Reconstruction [11.263458202880038]
In this paper, we present a novel translational symmetry-based approach to improving the deep neural networks.
We propose a novel scheme to fuse anchor-free detection in a single stage network, which enables the efficient training and better convergence.
We employ an off-the-shelf rendering engine like Blender to reconstruct the realistic high-quality 3D models using procedural modeling.
arXiv Detail & Related papers (2021-06-02T03:10:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.