The Case for HTML First Web Development
- URL: http://arxiv.org/abs/2602.17193v1
- Date: Thu, 19 Feb 2026 09:23:21 GMT
- Title: The Case for HTML First Web Development
- Authors: Juho Vepsäläinen
- Abstract summary: HTML First development puts focus on literally using HTML first when possible. It seems HTML-oriented web development can provide clear benefits to developers. There are open questions related to the magnitude of the benefits and the alignment with the recent trend of AI-driven web development.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Since its introduction in the early 90s, the web has become the largest application platform available globally. HyperText Markup Language (HTML) has been an essential part of the web since the beginning, as it allows defining webpages in a tree-like manner, including semantics and content. Although the web was never meant to be an application platform, it evolved as such, especially since the early 2000s, as web application frameworks became available. While the emergence of frameworks made it easier than ever to develop complex applications, it also put HTML on the back burner. As web standards caught up, especially with milestones such as HTML5, the gap between the web platform and frameworks was reduced. HTML First development emphasizes this shift and puts focus on literally using HTML first when possible, while encouraging minimalism familiar from the early days of the web. It seems HTML-oriented web development can provide clear benefits to developers, especially when it is combined with complementary approaches, such as embracing hypermedia and moving a large part of application logic to the server side. In the context of the htmx project, it was observed that moving towards HTML can reduce the size of a codebase greatly while leading to maintenance and development benefits due to the increased conceptual simplicity. Holotype-based comparisons for content-oriented websites show performance benefits, and the same observation was confirmed by a small case study where the Yle website was converted to follow HTML First principles. In short, the HTML First approach seems to have clear advantages for web developers, while there are open questions related to the magnitude of the benefits and the alignment with the recent trend of AI-driven web development.
Related papers
- Avenir-Web: Human-Experience-Imitating Multimodal Web Agents with Mixture of Grounding Experts [59.68272935616536]
Avenir-Web is a web agent that achieves a new open-source state of the art on the Online-Mind2Web benchmark in real-world deployment.
We evaluate Avenir-Web on Online-Mind2Web, a rigorous benchmark of live and user-centered web tasks.
arXiv Detail & Related papers (2026-02-02T18:50:07Z) - MAML: Towards a Faster Web in Developing Regions [15.590501918707337]
Mobile Application Markup Language (MAML) is a flat layout-based web specification language.
MAML is backward compatible as it can be transpiled to minimal HTML/JavaScript/CSS.
MAML offers webpage speedups by tens of seconds under challenging network conditions.
arXiv Detail & Related papers (2025-01-20T18:35:53Z) - HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems [62.36019283532854]
Retrieval-Augmented Generation (RAG) has been shown to improve knowledge capabilities and alleviate the hallucination problem of LLMs.
We propose HtmlRAG, which uses HTML instead of plain text as the format of retrieved knowledge in RAG.
arXiv Detail & Related papers (2024-11-05T09:58:36Z) - WAFFLE: Finetuning Multi-Modal Model for Automated Front-End Development [10.34452763764075]
We introduce Waffle, a new fine-tuning strategy that uses a structure-aware attention mechanism to improve LLMs' understanding of HTML's structure.
Models fine-tuned with Waffle show up to 9.00 pp (percentage point) higher HTML match, 0.0982 higher CW-SSIM, 32.99 higher CLIP, and 27.12 pp higher LLEM.
arXiv Detail & Related papers (2024-10-24T01:49:49Z) - Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs [112.89665642941814]
Multimodal large language models (MLLMs) have shown impressive success across modalities such as image, video, and audio.
Current MLLMs are surprisingly poor at understanding webpage screenshots and generating their corresponding HTML code.
We propose a benchmark consisting of a new large-scale webpage-to-code dataset for instruction tuning.
arXiv Detail & Related papers (2024-06-28T17:59:46Z) - A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis [69.15016747150868]
We introduce WebAgent, an agent that learns from self-experience to complete tasks on real websites.
WebAgent plans ahead by decomposing instructions into canonical sub-instructions, summarizes long HTML documents into task-relevant snippets, and acts on websites.
We empirically demonstrate that our modular recipe improves the success on real websites by over 50%, and that HTML-T5 is the best model to solve various HTML understanding tasks.
arXiv Detail & Related papers (2023-07-24T14:56:30Z) - Disappearing frameworks explained [0.0]
The purpose of this short book is to give a quick introduction to disappearing frameworks and show their meaning as an emerging topic within the space of web application development.
arXiv Detail & Related papers (2023-05-29T07:21:38Z) - Web 3.0: The Future of Internet [53.234101208024335]
Web 3.0 is a decentralized Web architecture that is more intelligent and safer than before.
Web 3.0 is capable of addressing web data ownership according to distributed technology.
It will optimize the internet world from the perspectives of economy, culture, and technology.
arXiv Detail & Related papers (2023-03-23T15:37:42Z) - ClueWeb22: 10 Billion Web Documents with Rich Information [28.68403988636645]
ClueWeb22 provides 10 billion web pages affiliated with rich information.
Its design was influenced by the need for a high quality, large scale web corpus to support academic and industry research.
arXiv Detail & Related papers (2022-11-29T00:49:40Z) - Understanding HTML with Large Language Models [73.92747433749271]
Large language models (LLMs) have shown exceptional performance on a variety of natural language tasks.
We contribute HTML understanding models (fine-tuned LLMs) and an in-depth analysis of their capabilities under three tasks.
We show that LLMs pretrained on standard natural language corpora transfer remarkably well to HTML understanding tasks.
arXiv Detail & Related papers (2022-10-08T07:27:17Z) - Web3 Challenges and Opportunities for the Market [0.0]
Web3 is the next generational step in the information age, where the web evolves into a more digestible medium for users and machines to browse knowledge.
The slow introduction of Web3 across the global software ecosystem will impact the people who enable the current iteration.
This evolution of the internet space will expand the way knowledge is shared, consumed, and owned.
arXiv Detail & Related papers (2022-09-06T12:17:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.