Related papers: BIMgent: Towards Autonomous Building Modeling via Computer-use Agents

BIMgent: Towards Autonomous Building Modeling via Computer-use Agents

URL: http://arxiv.org/abs/2506.07217v2
Date: Mon, 30 Jun 2025 08:31:07 GMT
Title: BIMgent: Towards Autonomous Building Modeling via Computer-use Agents
Authors: Zihan Deng, Changyu Du, Stavros Nousias, André Borrmann,
Abstract summary: We propose BIMgent, an agentic framework powered by multimodal large language models (LLMs)<n>We evaluate BIMgent on real-world building modeling tasks, including both text-based conceptual design generation and reconstruction from existing building design.<n>Results demonstrate that BIMgent effectively reduces manual workload while preserving design intent, highlighting its potential for practical deployment in real-world architectural modeling scenarios.
Score: 0.7499722271664147
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Existing computer-use agents primarily focus on general-purpose desktop automation tasks, with limited exploration of their application in highly specialized domains. In particular, the 3D building modeling process in the Architecture, Engineering, and Construction (AEC) sector involves open-ended design tasks and complex interaction patterns within Building Information Modeling (BIM) authoring software, which has yet to be thoroughly addressed by current studies. In this paper, we propose BIMgent, an agentic framework powered by multimodal large language models (LLMs), designed to enable autonomous building model authoring via graphical user interface (GUI) operations. BIMgent automates the architectural building modeling process, including multimodal input for conceptual design, planning of software-specific workflows, and efficient execution of the authoring GUI actions. We evaluate BIMgent on real-world building modeling tasks, including both text-based conceptual design generation and reconstruction from existing building design. The design quality achieved by BIMgent was found to be reasonable. Its operations achieved a 32% success rate, whereas all baseline models failed to complete the tasks (0% success rate). Results demonstrate that BIMgent effectively reduces manual workload while preserving design intent, highlighting its potential for practical deployment in real-world architectural modeling scenarios. Project page: https://tumcms.github.io/BIMgent.github.io/

Related papers

CreatiDesign: A Unified Multi-Conditional Diffusion Transformer for Creative Graphic Design [69.83433430133302]
CreatiDesign is a systematic solution for automated graphic design covering both model architecture and dataset construction.<n>First, we design a unified multi-condition driven architecture that enables flexible and precise integration of heterogeneous design elements.<n> Furthermore, to ensure that each condition precisely controls its designated image region, we propose a multimodal attention mask mechanism.
arXiv Detail & Related papers (2025-05-25T12:14:23Z)
ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges [72.19809898215857]
We introduce ModelingBench, a novel benchmark featuring real-world-inspired, open-ended problems from math modeling competitions across diverse domains.<n>These tasks require translating natural language into formal mathematical formulations, applying appropriate tools, and producing structured, defensible reports.<n>We also present ModelingAgent, a multi-agent framework that coordinates tool use, supports structured, creative solutions, and generates well-grounded, creative solutions.
arXiv Detail & Related papers (2025-05-21T03:33:23Z)
An LLM-enabled Multi-Agent Autonomous Mechatronics Design Framework [49.633199780510864]
This work proposes a multi-agent autonomous mechatronics design framework, integrating expertise across mechanical design, optimization, electronics, and software engineering.<n> operating primarily through a language-driven workflow, the framework incorporates structured human feedback to ensure robust performance under real-world constraints.<n>A fully functional autonomous vessel was developed with optimized propulsion, cost-effective electronics, and advanced control.
arXiv Detail & Related papers (2025-04-20T16:57:45Z)
Knowledge-Based Multi-Agent Framework for Automated Software Architecture Design [8.082263503892912]
We envision a Knowledge-based Multi-Agent Architecture Design (MAAD) framework.<n>MAAD uses agents to simulate human roles in the traditional software architecture design process.<n>We aim to advance the full automation of application-level system development.
arXiv Detail & Related papers (2025-03-26T13:35:10Z)
From Idea to CAD: A Language Model-Driven Multi-Agent System for Collaborative Design [0.06749750044497731]
We present an approach that mirrors this team structure with a Vision Language Model (VLM)-based Multi Agent System.<n>A model is generated automatically from sketches and/ or textual descriptions.<n>The resulting model can be refined collaboratively in an iterative validation loop with the user.
arXiv Detail & Related papers (2025-03-06T13:21:27Z)
Knowledge Graph Modeling-Driven Large Language Model Operating System (LLM OS) for Task Automation in Process Engineering Problem-Solving [0.0]
We present the Process Engineering Operations Assistant (PEOA), an AI-driven framework designed to solve complex problems in the chemical and process industries. The framework employs a modular architecture orchestrated by a meta-agent, which serves as the central coordinator. The results demonstrate the framework effectiveness in automating calculations, accelerating prototyping, and providing AI-augmented decision support for industrial processes.
arXiv Detail & Related papers (2024-08-23T13:52:47Z)
Text2BIM: Generating Building Models Using a Large Language Model-based Multi-Agent Framework [0.3749861135832073]
Text2 BIM is a multi-agent framework that generates 3D building models from natural language instructions. A rule-based model checker is introduced into the agentic workflow to guide the LLM agents in resolving issues within the generated models. The framework can effectively generate high-quality, structurally rational building models that are aligned with the abstract concepts specified by user input.
arXiv Detail & Related papers (2024-08-15T09:48:45Z)
Towards Automating the Retrospective Generation of BIM Models: A Unified Framework for 3D Semantic Reconstruction of the Built Environment [0.0]
Building Information Modeling is beneficial in construction projects. However, it faces challenges due to the lack of a unified and scalable framework for converting 3D model details into BIM. This paper introduces SR BIM, a unified semantic reconstruction architecture for BIM generation.
arXiv Detail & Related papers (2024-06-03T16:07:41Z)
Automatic Layout Planning for Visually-Rich Documents with Instruction-Following Models [81.6240188672294]
In graphic design, non-professional users often struggle to create visually appealing layouts due to limited skills and resources. We introduce a novel multimodal instruction-following framework for layout planning, allowing users to easily arrange visual elements into tailored layouts. Our method not only simplifies the design process for non-professionals but also surpasses the performance of few-shot GPT-4V models, with mIoU higher by 12% on Crello.
arXiv Detail & Related papers (2024-04-23T17:58:33Z)
3D-GPT: Procedural 3D Modeling with Large Language Models [47.72968643115063]
We introduce 3D-GPT, a framework utilizing large language models(LLMs) for instruction-driven 3D modeling. 3D-GPT positions LLMs as proficient problem solvers, dissecting the procedural 3D modeling tasks into accessible segments and appointing the apt agent for each task. Our empirical investigations confirm that 3D-GPT not only interprets and executes instructions, delivering reliable results but also collaborates effectively with human designers.
arXiv Detail & Related papers (2023-10-19T17:41:48Z)
AutoTransfer: AutoML with Knowledge Transfer -- An Application to Graph Neural Networks [75.11008617118908]
AutoML techniques consider each task independently from scratch, leading to high computational cost. Here we propose AutoTransfer, an AutoML solution that improves search efficiency by transferring the prior architectural design knowledge to the novel task of interest.
arXiv Detail & Related papers (2023-03-14T07:23:16Z)
Efficient Automatic Machine Learning via Design Graphs [72.85976749396745]
We propose FALCON, an efficient sample-based method to search for the optimal model design. FALCON features 1) a task-agnostic module, which performs message passing on the design graph via a Graph Neural Network (GNN), and 2) a task-specific module, which conducts label propagation of the known model performance information. We empirically show that FALCON can efficiently obtain the well-performing designs for each task using only 30 explored nodes.
arXiv Detail & Related papers (2022-10-21T21:25:59Z)
Quantitatively Assessing the Benefits of Model-driven Development in Agent-based Modeling and Simulation [80.49040344355431]
This paper compares the use of MDD and ABMS platforms in terms of effort and developer mistakes. The obtained results show that MDD4ABMS requires less effort to develop simulations with similar (sometimes better) design quality than NetLogo.
arXiv Detail & Related papers (2020-06-15T23:29:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.