Envisioning Future Interactive Web Development: Editing Webpage with Natural Language
- URL: http://arxiv.org/abs/2510.26516v1
- Date: Thu, 30 Oct 2025 14:09:50 GMT
- Title: Envisioning Future Interactive Web Development: Editing Webpage with Natural Language
- Authors: Truong Hai Dang, Jingyu Xiao, Yintong Huo
- Abstract summary: We introduce a novel, automated data generation pipeline that uses Large Language Models to synthesize a high-quality fine-tuning dataset for web editing. By fine-tuning models on Instruct4Edit, we demonstrate consistent improvement in translating human intent into precise, structurally coherent, and visually accurate code changes.
- Score: 5.799436684542269
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The evolution of web applications relies on iterative code modifications, a process that is traditionally manual and time-consuming. While Large Language Models (LLMs) can generate UI code, their ability to edit existing code from new design requirements (e.g., "center the logo") remains a challenge. This is largely due to the absence of large-scale, high-quality tuning data to align model performance with human expectations. In this paper, we introduce a novel, automated data generation pipeline that uses LLMs to synthesize a high-quality fine-tuning dataset for web editing, named Instruct4Edit. Our approach generates diverse instructions, applies the corresponding code modifications, and performs visual verification to ensure correctness. By fine-tuning models on Instruct4Edit, we demonstrate consistent improvement in translating human intent into precise, structurally coherent, and visually accurate code changes. This work provides a scalable and transparent foundation for natural language based web editing, demonstrating that fine-tuning smaller open-source models can achieve competitive performance with proprietary systems. We release all data, code implementations, and model checkpoints for reproduction.
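The abstract describes a three-stage pipeline: generate diverse instructions, apply the corresponding code modifications, and verify the result before keeping it as training data. The sketch below is a minimal illustration of that data shape and filtering step, not the paper's implementation: `apply_edit` is a hypothetical stub standing in for the fine-tuned model (it only handles the paper's running example, "center the logo"), and a crude tag-balance check stands in for the paper's visual verification.

```python
import re
from dataclasses import dataclass


@dataclass
class EditExample:
    """One training triple: instruction, source code, edited code."""
    instruction: str
    source_html: str
    edited_html: str


VOID_TAGS = {"br", "img", "meta", "link", "hr", "input"}


def tags_balanced(html: str) -> bool:
    """Crude structural-coherence check: every opening tag has a
    matching close, ignoring void elements."""
    stack = []
    for m in re.finditer(r"<(/?)(\w+)[^>]*>", html):
        closing, name = m.group(1), m.group(2).lower()
        if name in VOID_TAGS:
            continue
        if closing:
            if not stack or stack.pop() != name:
                return False
        else:
            stack.append(name)
    return not stack


def apply_edit(source_html: str, instruction: str) -> str:
    """Hypothetical stand-in for the LLM editing step; a real pipeline
    would call a fine-tuned model here."""
    if "center" in instruction and "logo" in instruction:
        return source_html.replace(
            '<img id="logo"',
            '<img id="logo" style="display:block;margin:0 auto;"',
        )
    return source_html


def verify(example: EditExample) -> bool:
    """Keep only examples whose edit actually changed the code and left
    it structurally coherent (stand-in for visual verification)."""
    return (example.edited_html != example.source_html
            and tags_balanced(example.edited_html))


source = '<div><img id="logo" src="logo.png"></div>'
ex = EditExample("center the logo", source,
                 apply_edit(source, "center the logo"))
print(verify(ex))  # → True
```

Examples that fail the check would be discarded rather than added to the dataset, which is how a pipeline of this shape trades raw volume for correctness.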
Related papers
- Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance [55.32799307123252]
We introduce a scalable data generation pipeline that transforms existing video editing pairs into high-fidelity training quadruplets. We propose a unified editing architecture, Kiwi-Edit, that synergizes learnable queries and latent visual features for reference semantic guidance.
arXiv Detail & Related papers (2026-03-02T18:46:28Z)
- Affordance Representation and Recognition for Autonomous Agents [64.39018305018904]
This paper introduces a pattern language for world modeling from structured data. The DOM Transduction Pattern addresses the challenge of web page complexity. The Hypermedia Affordances Recognition Pattern enables the agent to dynamically enrich its world model.
arXiv Detail & Related papers (2025-10-28T14:27:28Z)
- LOCOFY Large Design Models -- Design to code conversion solution [0.0]
We introduce the Large Design Models paradigm, specifically trained on designs and webpages to enable seamless conversion from design to code. We have developed a training and inference pipeline by incorporating data engineering and appropriate model architecture modification. Our models achieved exceptional end-to-end design-to-code conversion accuracy using a novel preview match score metric.
arXiv Detail & Related papers (2025-07-22T03:54:57Z)
- Contextually Guided Transformers via Low-Rank Adaptation [14.702057924366345]
Large Language Models (LLMs) based on Transformers excel at text processing, but their reliance on prompts for specialized behavior introduces computational overhead. We propose a modification to the Transformer architecture that eliminates the need for explicit prompts by learning to encode context into the model's weights.
arXiv Detail & Related papers (2025-06-06T01:34:39Z)
- RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs [3.41612427812159]
In digital content creation tools, users express their needs through natural language queries that must be mapped to API calls. Existing approaches to synthetic data generation fail to replicate real-world data distributions. We present a novel router-based architecture that generates high-quality synthetic training data.
arXiv Detail & Related papers (2025-05-15T16:53:45Z)
- DreamOmni: Unified Image Generation and Editing [76.46811926046225]
We introduce DreamOmni, a unified model for image generation and editing. For training, DreamOmni jointly trains T2I generation and downstream tasks. This collaboration significantly boosts editing performance.
arXiv Detail & Related papers (2024-12-22T17:17:28Z)
- Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment [57.0121616203175]
We propose FiSAO, a novel self-alignment method that utilizes the model's own visual encoder as a fine-grained verifier to improve vision-language alignment. By leveraging token-level feedback from the vision encoder, FiSAO significantly improves vision-language alignment, even surpassing traditional preference tuning methods that require additional data.
arXiv Detail & Related papers (2024-10-18T03:34:32Z)
- GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding [35.536478348255315]
Recent code language models have scaled to billions of parameters, but model source code solely as text tokens. We take the best of both worlds with GALLa - Graph Aligned Large Language Models.
arXiv Detail & Related papers (2024-09-06T10:57:34Z)
- CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation [58.84212778960507]
CodeGRAG builds a graphical view of code blocks based on their control flow and data flow to better interpret programming domain knowledge. CodeGRAG significantly improves the code generation ability of LLMs and can even offer performance gains for cross-lingual code generation.
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
- Memory-Based Model Editing at Scale [102.28475739907498]
Existing model editors struggle to accurately model an edit's intended scope.
We propose Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC)
SERAC stores edits in an explicit memory and learns to reason over them to modulate the base model's predictions as needed.
arXiv Detail & Related papers (2022-06-13T23:40:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.