ComUICoder: Component-based Reusable UI Code Generation for Complex Websites via Semantic Segmentation and Element-wise Feedback
- URL: http://arxiv.org/abs/2602.19276v1
- Date: Sun, 22 Feb 2026 17:17:16 GMT
- Title: ComUICoder: Component-based Reusable UI Code Generation for Complex Websites via Semantic Segmentation and Element-wise Feedback
- Authors: Jingyu Xiao, Jiantong Qin, Shuoqi Li, Man Ho Lam, Yuxuan Wan, Jen-tse Huang, Yintong Huo, Michael R. Lyu
- Abstract summary: We introduce ComUICoder, a semantic-aware code generation tool for complex websites. ComUICoder significantly improves overall generation quality and code reusability on complex multi-page websites.
- Score: 38.10354940578983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal Large Language Models (MLLMs) have demonstrated strong performance on the UI-to-code task, which aims to generate UI code from design mock-ups. However, when applied to long and complex websites, they often struggle with fragmented segmentation, redundant code generation for repetitive components, and frequent UI inconsistencies. To systematically investigate and address these challenges, we introduce ComUIBench, a new multi-page complex webpage benchmark with component annotations, designed to evaluate MLLMs' ability to generate reusable UI code in realistic website scenarios. Building upon this benchmark, we propose ComUICoder, a component-based UI code generation framework that emphasizes semantic-aware segmentation, code reuse, and fine-grained refinement. Specifically, ComUICoder incorporates (1) Hybrid Semantic-aware Block Segmentation for accurate detection of semantically coherent UI blocks, (2) Visual-aware Graph-based Block Merge to consolidate structurally similar components within and across webpages for reusable implementation, and (3) Priority-based Element-wise Feedback to refine generated code and reduce element-level inconsistencies. Extensive experiments demonstrate that ComUICoder significantly improves overall generation quality and code reusability on complex multi-page websites. Our datasets and code are publicly available at https://github.com/WebPAI/ComUICoder.
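As an illustration of the component-reuse idea behind the Visual-aware Graph-based Block Merge step, the sketch below clusters visually similar UI blocks across pages via a similarity graph so that each cluster can be implemented once as a reusable component. The `Block` class, the toy embedding, and the 0.85 threshold are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch (not the authors' implementation): merging visually
# similar UI blocks into reusable components via a similarity graph.
# Block, embed_block, and the 0.85 threshold are hypothetical placeholders.
from dataclasses import dataclass
from itertools import combinations

import numpy as np


@dataclass
class Block:
    block_id: str
    page: str
    crop: np.ndarray  # cropped screenshot region of the block


def embed_block(block: Block) -> np.ndarray:
    """Stand-in for a visual encoder (e.g. a CLIP-style image embedding)."""
    return block.crop.mean(axis=(0, 1))  # toy feature: mean color per channel


def merge_blocks(blocks: list[Block], threshold: float = 0.85) -> list[list[Block]]:
    """Group blocks whose embeddings are similar, within and across pages.

    Builds an undirected similarity graph and returns its connected
    components; each component would be implemented once as a reusable
    component and instantiated wherever its members appear.
    """
    embs = [embed_block(b) for b in blocks]
    # Adjacency via cosine similarity above a threshold.
    adj = {i: set() for i in range(len(blocks))}
    for i, j in combinations(range(len(blocks)), 2):
        a, b = embs[i], embs[j]
        sim = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
        if sim >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    # Connected components by DFS.
    seen, groups = set(), []
    for start in range(len(blocks)):
        if start in seen:
            continue
        stack, comp = [start], []
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            comp.append(blocks[node])
            stack.extend(adj[node] - seen)
        groups.append(comp)
    return groups
```

Each resulting group would then be generated once and instantiated wherever its member blocks appear, which is how component-based generation avoids re-emitting near-identical markup for repeated blocks.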
Related papers
- VSA: Visual-Structural Alignment for UI-to-Code [29.15071743239679]
We propose VSA (Visual-Structural Alignment), a multi-stage paradigm designed to synthesize organized assets through visual-text alignment. Our framework yields a substantial improvement in code modularity and architectural consistency over state-of-the-art benchmarks.
arXiv Detail & Related papers (2025-12-23T03:55:45Z) - Widget2Code: From Visual Widgets to UI Code via Multimodal LLMs [28.028216548288725]
We formalize the Widget-to-Code (Widget2Code) setting and introduce an image-only widget benchmark with fine-grained, multi-dimensional evaluation metrics. Benchmarking shows that although general-purpose multimodal large language models (MLLMs) outperform specialized UI2Code methods, they still produce unreliable and visually inconsistent code. At the perceptual level, we follow widget design principles to assemble atomic components into complete layouts, equipped with icon retrieval and reusable visualization modules.
arXiv Detail & Related papers (2025-12-22T22:45:39Z) - DesignCoder: Hierarchy-Aware and Self-Correcting UI Code Generation with Large Language Models [17.348284143568282]
DesignCoder is a novel hierarchy-aware and self-correcting automated code generation framework. We introduce UI Grouping Chains, which enhance MLLMs' capability to understand and predict complex nested UI hierarchies. We also incorporate a self-correction mechanism to improve the model's ability to identify and rectify errors in the generated code.
arXiv Detail & Related papers (2025-06-16T16:20:43Z) - MLLM-Based UI2Code Automation Guided by UI Layout Information [17.177322441575196]
We propose a novel MLLM-based framework that generates UI code from real-world webpage images and consists of three key modules. For evaluation, we build Snap2Code, a new benchmark dataset covering 350 real-world websites.
arXiv Detail & Related papers (2025-06-12T06:04:16Z) - Universal Item Tokenization for Transferable Generative Recommendation [89.42584009980676]
We propose UTGRec, a universal item tokenization approach for transferable generative recommendation. By devising tree-structured codebooks, we discretize content representations into corresponding codes for item tokenization. For raw content reconstruction, we employ dual lightweight decoders to reconstruct item text and images from discrete representations. For collaborative knowledge integration, we assume that co-occurring items are similar and integrate collaborative signals through co-occurrence alignment and reconstruction.
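As a rough intuition for how tree-structured codebooks can turn continuous item content into discrete tokens, the sketch below applies multi-level residual quantization so that each item maps to a short code sequence. The codebook sizes, depth, and residual scheme are assumptions for illustration and may differ from UTGRec's actual design.

```python
# Minimal sketch (assumptions, not the UTGRec implementation): residual
# multi-level quantization that turns a dense item embedding into a short
# sequence of discrete codes, one code per codebook level.
import numpy as np

rng = np.random.default_rng(0)

NUM_LEVELS = 3       # depth of the code "tree"
CODES_PER_LEVEL = 8  # codebook size at each level
DIM = 16             # item embedding dimension

# One codebook per level; in a trained tokenizer these would be learned.
codebooks = [rng.normal(size=(CODES_PER_LEVEL, DIM)) for _ in range(NUM_LEVELS)]


def tokenize_item(embedding: np.ndarray) -> list[int]:
    """Discretize an item embedding into one code index per level.

    At each level we pick the nearest codebook entry and quantize the
    residual at the next level, so earlier codes capture coarse content
    and later codes refine it.
    """
    residual = embedding.copy()
    codes = []
    for codebook in codebooks:
        dists = np.linalg.norm(codebook - residual, axis=1)
        idx = int(dists.argmin())
        codes.append(idx)
        residual = residual - codebook[idx]
    return codes


item_embedding = rng.normal(size=DIM)  # e.g. pooled text+image features
print(tokenize_item(item_embedding))   # e.g. [5, 2, 7]
```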
arXiv Detail & Related papers (2025-04-06T08:07:49Z) - EpiCoder: Encompassing Diversity and Complexity in Code Generation [66.43738008739555]
Existing methods for code generation use code snippets as seed data. We introduce a novel feature tree-based synthesis framework, which revolves around hierarchical code features. Our framework provides precise control over the complexity of the generated code, enabling functionalities that range from function-level operations to multi-file scenarios.
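The sketch below illustrates, under assumed feature names and a made-up sampling rule, how a hierarchical feature tree could be traversed to dial the complexity of synthesized coding tasks up or down; it is not EpiCoder's implementation.

```python
# Hedged sketch (not EpiCoder's actual code): a hierarchical feature tree
# from which paths are sampled to control the complexity of synthesized
# coding tasks. Feature names and the depth-based rule are invented here.
from dataclasses import dataclass, field
import random


@dataclass
class FeatureNode:
    name: str
    children: list["FeatureNode"] = field(default_factory=list)


# A tiny feature tree: coarse categories at the root, concrete features below.
feature_tree = FeatureNode("file_io", [
    FeatureNode("read", [FeatureNode("csv_parsing"), FeatureNode("json_parsing")]),
    FeatureNode("write", [FeatureNode("atomic_write"), FeatureNode("append_log")]),
])


def sample_features(node: FeatureNode, max_depth: int) -> list[str]:
    """Collect feature names along a randomly chosen path up to max_depth.

    A larger max_depth pulls in deeper, more specific features, which is one
    way to scale the complexity of the task to be synthesized.
    """
    if max_depth == 0 or not node.children:
        return [node.name]
    child = random.choice(node.children)
    return [node.name] + sample_features(child, max_depth - 1)


print(sample_features(feature_tree, max_depth=1))  # coarse task, e.g. ['file_io', 'read']
print(sample_features(feature_tree, max_depth=2))  # deeper, more specific features
```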
arXiv Detail & Related papers (2025-01-08T18:58:15Z) - Interaction2Code: Benchmarking MLLM-based Interactive Webpage Code Generation from Interactive Prototyping [57.024913536420264]
Multimodal Large Language Models (MLLMs) have demonstrated remarkable performance on the design-to-code task. We present the first systematic investigation of MLLMs in generating interactive webpages.
arXiv Detail & Related papers (2024-11-05T17:40:03Z) - EGFE: End-to-end Grouping of Fragmented Elements in UI Designs with Multimodal Learning [10.885275494978478]
Grouping fragmented elements can greatly improve the readability and maintainability of the generated code.
Current methods employ a two-stage strategy that introduces hand-crafted rules to group fragmented elements.
We propose EGFE, a novel method for automatic end-to-end grouping of fragmented elements via UI sequence prediction.
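To make the sequence-prediction view of grouping concrete, the sketch below tags a left-to-right sequence of design elements with BIO-style labels and merges tagged runs into groups. EGFE learns such tags with a multimodal model; the horizontal-gap rule here is only a stand-in.

```python
# Hedged sketch (not the EGFE model): grouping fragmented UI elements by
# tagging an element sequence. The distance-based tagger below stands in
# for a learned sequence model so the grouping step itself is concrete.
from dataclasses import dataclass


@dataclass
class Element:
    name: str
    x: float      # left edge of the element's bounding box
    width: float


def tag_elements(elements: list[Element], gap: float = 8.0) -> list[str]:
    """Assign BIO-style tags: 'B' begins a group, 'I' continues it, 'O' stands alone."""
    tags = ["O"] * len(elements)
    for i in range(1, len(elements)):
        prev = elements[i - 1]
        if elements[i].x - (prev.x + prev.width) < gap:  # adjacent fragments
            if tags[i - 1] == "O":
                tags[i - 1] = "B"
            tags[i] = "I"
    return tags


def merge_groups(elements: list[Element], tags: list[str]) -> list[list[Element]]:
    """Merge consecutive B/I elements into groups; 'O' elements stay separate."""
    groups, current = [], []
    for el, tag in zip(elements, tags):
        if tag == "B":
            if current:
                groups.append(current)
            current = [el]
        elif tag == "I":
            current.append(el)
        else:
            if current:
                groups.append(current)
                current = []
            groups.append([el])
    if current:
        groups.append(current)
    return groups


# Usage: three icon fragments close together form one group; the button stands alone.
elements = [Element("icon_a", 0, 10), Element("icon_b", 12, 10),
            Element("icon_c", 24, 10), Element("button", 200, 60)]
print(merge_groups(elements, tag_elements(elements)))
```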
arXiv Detail & Related papers (2023-09-18T15:28:12Z) - InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback [50.725076393314964]
We introduce InterCode, a lightweight, flexible, and easy-to-use framework for interactive coding as a standard reinforcement learning environment.
Our framework is language- and platform-agnostic and uses self-contained Docker environments to provide safe and reproducible execution.
We demonstrate InterCode's viability as a testbed by evaluating multiple state-of-the-art LLMs configured with different prompting strategies.
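The essence of such an environment is an observation-action-reward loop around real command execution. The sketch below shows that loop with a toy local-shell environment; it is not the InterCode API, and the class and method names are invented for illustration only.

```python
# Hedged sketch of the interactive-coding loop InterCode standardizes: an
# agent emits a command, the environment executes it and returns an
# observation and a reward. This is NOT the InterCode API.
import subprocess


class ToyShellEnv:
    """Minimal stand-in environment that executes shell commands locally.

    InterCode instead runs actions inside self-contained Docker containers,
    which is what makes its execution safe and reproducible.
    """

    def __init__(self, goal_output: str):
        self.goal_output = goal_output

    def reset(self) -> str:
        return "task: produce the goal output on stdout"

    def step(self, action: str):
        result = subprocess.run(action, shell=True, capture_output=True,
                                text=True, timeout=5)
        observation = result.stdout + result.stderr
        reward = 1.0 if result.stdout.strip() == self.goal_output else 0.0
        done = reward == 1.0
        return observation, reward, done


# Usage: a trivial agent issues one command and reads the execution feedback.
env = ToyShellEnv(goal_output="hello")
obs = env.reset()
obs, reward, done = env.step("echo hello")
print(reward, done)  # 1.0 True
```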
arXiv Detail & Related papers (2023-06-26T17:59:50Z) - Boundary-Aware Segmentation Network for Mobile and Web Applications [60.815545591314915]
Boundary-Aware Segmentation Network (BASNet) couples a predict-refine architecture with a hybrid loss for highly accurate image segmentation.
BASNet runs at over 70 fps on a single GPU, which benefits many potential real-world applications.
Based on BASNet, we further developed two (close to) commercial applications: AR COPY & PASTE, in which BASNet is integrated with augmented reality for "COPYING" and "PASTING" real-world objects, and OBJECT CUT, a web-based tool for automatic object background removal.
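For reference, a BASNet-style hybrid loss sums binary cross-entropy, SSIM, and IoU terms over the predicted mask. The sketch below shows that combination with a simplified global SSIM (BASNet computes SSIM over local patches), so it is an approximation rather than the paper's exact loss.

```python
# Hedged sketch of a BASNet-style hybrid segmentation loss: BCE + SSIM + IoU.
# The global SSIM term is a simplification of the patch-wise SSIM used in BASNet.
import numpy as np

EPS = 1e-7


def bce_loss(pred, target):
    pred = np.clip(pred, EPS, 1 - EPS)
    return float(-(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean())


def ssim_loss(pred, target, c1=0.01 ** 2, c2=0.03 ** 2):
    mu_p, mu_t = pred.mean(), target.mean()
    var_p, var_t = pred.var(), target.var()
    cov = ((pred - mu_p) * (target - mu_t)).mean()
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2)
    )
    return float(1 - ssim)


def iou_loss(pred, target):
    inter = (pred * target).sum()
    union = pred.sum() + target.sum() - inter
    return float(1 - (inter + EPS) / (union + EPS))


def hybrid_loss(pred, target):
    """Sum of the three terms, applied to each side output of the network."""
    return bce_loss(pred, target) + ssim_loss(pred, target) + iou_loss(pred, target)


# Usage with a toy 4x4 prediction/ground-truth pair.
rng = np.random.default_rng(0)
pred = rng.uniform(size=(4, 4))
target = (rng.uniform(size=(4, 4)) > 0.5).astype(float)
print(hybrid_loss(pred, target))
```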
arXiv Detail & Related papers (2021-01-12T19:20:26Z)