DesignCoder: Hierarchy-Aware and Self-Correcting UI Code Generation with Large Language Models
- URL: http://arxiv.org/abs/2506.13663v1
- Date: Mon, 16 Jun 2025 16:20:43 GMT
- Title: DesignCoder: Hierarchy-Aware and Self-Correcting UI Code Generation with Large Language Models
- Authors: Yunnong Chen, Shixian Ding, YingYing Zhang, Wenkai Chen, Jinzhou Du, Lingyun Sun, Liuqing Chen
- Abstract summary: DesignCoder is a novel hierarchy-aware and self-correcting automated code generation framework. We introduce UI Grouping Chains, which enhance MLLMs' capability to understand and predict complex nested UI hierarchies. We also incorporate a self-correction mechanism to improve the model's ability to identify and rectify errors in the generated code.
- Score: 17.348284143568282
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multimodal large language models (MLLMs) have streamlined front-end interface development by automating code generation. However, these models also introduce challenges in ensuring code quality. Existing approaches struggle to maintain both visual consistency and functional completeness in the generated components. Moreover, they lack mechanisms to assess the fidelity and correctness of the rendered pages. To address these issues, we propose DesignCoder, a novel hierarchy-aware and self-correcting automated code generation framework. Specifically, we introduce UI Grouping Chains, which enhance MLLMs' capability to understand and predict complex nested UI hierarchies. Subsequently, DesignCoder employs a hierarchical divide-and-conquer approach to generate front-end code. Finally, we incorporate a self-correction mechanism to improve the model's ability to identify and rectify errors in the generated code. Extensive evaluations on a dataset of UI mockups collected from both open-source communities and industry projects demonstrate that DesignCoder outperforms state-of-the-art baselines in React Native, a widely adopted UI framework. Our method achieves performance increases of 37.63%, 9.52%, and 12.82% in visual similarity metrics (MSE, CLIP, SSIM) and significantly improves code structure similarity in terms of TreeBLEU, Container Match, and Tree Edit Distance by 30.19%, 29.31%, and 24.67%, respectively. Furthermore, we conducted a user study with professional developers to assess the quality and practicality of the generated code. Results indicate that DesignCoder aligns with industry best practices, demonstrating high usability, readability, and maintainability. Our approach provides an efficient and practical solution for agile front-end development, enabling development teams to focus more on core functionality and product innovation.
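To make the abstract's pipeline concrete, below is a minimal, illustrative Python sketch of a hierarchy-aware, self-correcting UI-to-code loop: predict a nested grouping of the mockup, generate code bottom-up per group, then iteratively repair detected issues. All names here (UINode, group_ui, generate_code, self_correct, the stand-in query_mllm callable) are hypothetical placeholders introduced for illustration, not DesignCoder's actual implementation or API.

```python
# Minimal sketch of a hierarchy-aware, self-correcting UI-to-code pipeline.
# NOTE: all names below (UINode, group_ui, generate_code, self_correct,
# query_mllm, check) are hypothetical placeholders for illustration only;
# they are not DesignCoder's actual classes or APIs.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class UINode:
    """One node in the predicted UI hierarchy (a group or a leaf widget)."""
    name: str
    children: List["UINode"] = field(default_factory=list)
    code: str = ""  # generated snippet for this subtree


def group_ui(mockup: str, query_mllm: Callable[[str], str]) -> UINode:
    """Step 1 (grouping): ask the model for a nested grouping of the mockup.

    Parsing of the model reply is omitted; a toy tree is returned instead.
    """
    query_mllm(f"Group the elements of this UI into a nested hierarchy:\n{mockup}")
    return UINode("Screen", [UINode("Header"),
                             UINode("Card", [UINode("Title"), UINode("Button")])])


def generate_code(node: UINode, query_mllm: Callable[[str], str]) -> str:
    """Step 2 (divide and conquer): generate leaves first, then compose parents."""
    child_snippets = [generate_code(child, query_mllm) for child in node.children]
    prompt = (f"Write a React Native component named {node.name} "
              "that composes these children:\n" + "\n".join(child_snippets))
    node.code = query_mllm(prompt)
    return node.code


def self_correct(code: str,
                 check: Callable[[str], List[str]],
                 query_mllm: Callable[[str], str],
                 max_rounds: int = 3) -> str:
    """Step 3 (self-correction): detect issues (render errors, visual drift)
    and feed them back to the model for a bounded number of repair rounds."""
    for _ in range(max_rounds):
        issues = check(code)
        if not issues:
            break
        code = query_mllm("Fix these issues:\n" + "\n".join(issues) + "\n\n" + code)
    return code


if __name__ == "__main__":
    # Stand-ins so the sketch runs without any model or renderer available.
    fake_mllm = lambda prompt: f"// generated for: {prompt.splitlines()[0][:60]}"
    fake_check = lambda code: []  # pretend the rendered page already matches
    tree = group_ui("login screen: header, card with title and button", fake_mllm)
    draft = generate_code(tree, fake_mllm)
    print(self_correct(draft, fake_check, fake_mllm))
```

The divide-and-conquer recursion mirrors the hierarchical generation step described above, while the bounded repair loop stands in for the self-correction mechanism; a real system would replace the stand-in model and checker with MLLM calls and a rendering/comparison step.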
Related papers
- ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents [35.10813247827737]
We introduce a modular multi-agent framework that performs user interface-to-code generation in three interpretable stages. The framework improves robustness, interpretability, and fidelity over end-to-end black-box methods. Our approach achieves state-of-the-art performance in layout accuracy, structural coherence, and code correctness.
arXiv Detail & Related papers (2025-07-30T16:41:21Z)
- Assemble Your Crew: Automatic Multi-agent Communication Topology Design via Autoregressive Graph Generation [72.44384066166147]
Multi-agent systems (MAS) based on large language models (LLMs) have emerged as a powerful solution for dealing with complex problems across diverse domains. Existing approaches are fundamentally constrained by their reliance on a template graph modification paradigm with a predefined set of agents and hard-coded interaction structures. We propose ARG-Designer, a novel autoregressive model that operationalizes this paradigm by constructing the collaboration graph from scratch.
arXiv Detail & Related papers (2025-07-24T09:17:41Z)
- DesignBench: A Comprehensive Benchmark for MLLM-based Front-end Code Generation [31.237236649603123]
Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in automated front-end engineering. DesignBench is a benchmark for assessing MLLMs' capabilities in automated front-end engineering.
arXiv Detail & Related papers (2025-06-06T17:21:21Z)
- UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding [84.87802580670579]
We introduce UniToken, an auto-regressive generation model that encodes visual inputs through a combination of discrete and continuous representations. Our unified visual encoding framework captures both high-level semantics and low-level details, delivering multidimensional information.
arXiv Detail & Related papers (2025-04-06T09:20:49Z)
- Harmonizing Visual Representations for Unified Multimodal Understanding and Generation [53.01486796503091]
We present Harmon, a unified autoregressive framework that harmonizes understanding and generation tasks with a shared MAR encoder. Harmon achieves state-of-the-art image generation results on the GenEval, MJHQ30K and WISE benchmarks.
arXiv Detail & Related papers (2025-03-27T20:50:38Z)
- Learning to Solve and Verify: A Self-Play Framework for Code and Test Generation [69.62857948698436]
Recent advances in large language models (LLMs) have improved their performance on coding benchmarks. However, improvement is plateauing due to the exhaustion of readily available high-quality data. We propose Sol-Ver, a self-play solver-verifier framework that jointly improves a single model's code and test generation capacity.
arXiv Detail & Related papers (2025-02-20T18:32:19Z)
- EpiCoder: Encompassing Diversity and Complexity in Code Generation [49.170195362149386]
Existing methods for code generation use code snippets as seed data. We introduce a novel feature tree-based synthesis framework, which revolves around hierarchical code features. Our framework provides precise control over the complexity of the generated code, enabling functionalities that range from function-level operations to multi-file scenarios.
arXiv Detail & Related papers (2025-01-08T18:58:15Z)
- See-Saw Generative Mechanism for Scalable Recursive Code Generation with Generative AI [0.0]
This paper introduces the See-Saw generative mechanism, a novel methodology for dynamic and iterative code generation.
The proposed approach alternates between main code updates and dependency generation to ensure alignment and functionality.
The mechanism ensures that all code components are synchronized and functional, enabling scalable and efficient project generation.
arXiv Detail & Related papers (2024-11-16T18:54:56Z)
- CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models [106.11371409170818]
Large language models (LLMs) can act as agents with capabilities to self-refine and improve generated code autonomously.
We propose CodeTree, a framework for LLM agents to efficiently explore the search space in different stages of the code generation process.
Specifically, we adopted a unified tree structure to explicitly explore different coding strategies, generate corresponding coding solutions, and subsequently refine the solutions.
arXiv Detail & Related papers (2024-11-07T00:09:54Z)
- Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework.
Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z)
- Bridging Design and Development with Automated Declarative UI Code Generation [18.940075474582564]
Declarative UI frameworks have gained widespread adoption in mobile app development, offering benefits such as improved code readability and easier maintenance.
Recent advancements in multimodal large language models (MLLMs) have shown promise in directly generating mobile app code from user interface (UI) designs.
We propose DeclarUI, an automated approach that synergizes computer vision (CV), MLLMs, and iterative compiler-driven optimization to generate and refine declarative UI code from designs.
arXiv Detail & Related papers (2024-09-18T03:04:12Z)
- ALaRM: Align Language Models via Hierarchical Rewards Modeling [41.79125107279527]
We introduce ALaRM, the first framework modeling hierarchical rewards in reinforcement learning from human feedback.
The framework addresses the limitations of current alignment approaches by integrating holistic rewards with aspect-specific rewards.
We validate our approach through applications in long-form question answering and machine translation tasks.
arXiv Detail & Related papers (2024-03-11T14:28:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.