Related papers: APD-Agents: A Large Language Model-Driven Multi-Agents Collaborative Framework for Automated Page Design

URL: http://arxiv.org/abs/2511.14101v1
Date: Tue, 18 Nov 2025 03:39:26 GMT
Title: APD-Agents: A Large Language Model-Driven Multi-Agents Collaborative Framework for Automated Page Design
Authors: Xinpeng Chen, Xiaofeng Han, Kaihao Zhang, Guochao Ren, Yujie Wang, Wenhao Cao, Yang Zhou, Jianfeng Lu, Zhenbo Song
Abstract summary: We propose APD-agents, a large language model driven multi-agent framework for app page design. Our work fully leverages the automatic collaboration capabilities of large-model-driven multi-agent systems. Experimental results on the RICO dataset show that APD-agents achieve state-of-the-art performance.
Score: 28.89702589792701
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Layout design is a crucial step in developing mobile app pages. However, crafting satisfactory designs is time-intensive for designers: they need to consider which controls and content to present on the page, and then repeatedly adjust their size, position, and style for better aesthetics and structure. Although many design tools can now help perform these repetitive tasks, extensive training is needed to use them effectively. Moreover, collaborative design across app pages demands extra time to align standards and ensure consistent styling. In this work, we propose APD-agents, a large language model (LLM) driven multi-agent framework for automated page design in mobile applications. Our framework contains OrchestratorAgent, SemanticParserAgent, PrimaryLayoutAgent, TemplateRetrievalAgent, and RecursiveComponentAgent. Upon receiving the user's description of the page, the OrchestratorAgent dynamically directs the other agents to accomplish the user's design task. Specifically, the SemanticParserAgent converts the user's description of the page content into structured data. The PrimaryLayoutAgent generates an initial coarse-grained layout of the page. The TemplateRetrievalAgent fetches semantically relevant few-shot examples to enhance the quality of layout generation. In addition, a RecursiveComponentAgent decides, for each element in the layout, how to recursively generate all the fine-grained sub-elements it contains. Our work fully leverages the automatic collaboration capabilities of large-model-driven multi-agent systems. Experimental results on the RICO dataset show that our APD-agents achieve state-of-the-art performance.
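Illustrative sketch (not from the paper): the division of labor described in the abstract can be pictured roughly as below. Only the five agent names come from the abstract. Every function, the `Element` type, the `library` dictionary, and the layout arithmetic are hypothetical stand-ins. The real agents are LLM-driven and the OrchestratorAgent directs them dynamically, whereas this sketch uses deterministic stubs in a fixed order.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A layout node: a named box with optional child elements (hypothetical type)."""
    name: str
    box: tuple  # (x, y, width, height) in arbitrary units
    children: list = field(default_factory=list)


def semantic_parser_agent(description: str) -> list:
    """Stand-in for SemanticParserAgent: turn a comma-separated description
    into a structured list of component names."""
    return [part.strip() for part in description.split(",") if part.strip()]


def primary_layout_agent(components: list, width: int = 360, row_height: int = 80) -> list:
    """Stand-in for PrimaryLayoutAgent: stack components vertically
    to form a coarse-grained layout."""
    return [Element(name, (0, i * row_height, width, row_height))
            for i, name in enumerate(components)]


def template_retrieval_agent(name: str, library: dict) -> list:
    """Stand-in for TemplateRetrievalAgent: an exact-name lookup replaces
    semantic retrieval of few-shot examples."""
    return library.get(name, [])


def recursive_component_agent(element: Element, library: dict,
                              depth: int = 0, max_depth: int = 3) -> None:
    """Stand-in for RecursiveComponentAgent: split an element's box evenly
    among its sub-elements, then recurse into each of them."""
    if depth >= max_depth:  # guard against cyclic template libraries
        return
    sub_names = template_retrieval_agent(element.name, library)
    if not sub_names:
        return
    x, y, w, h = element.box
    child_w = w // len(sub_names)
    for i, sub_name in enumerate(sub_names):
        child = Element(sub_name, (x + i * child_w, y, child_w, h))
        element.children.append(child)
        recursive_component_agent(child, library, depth + 1, max_depth)


def orchestrator_agent(description: str, library: dict) -> list:
    """Stand-in for OrchestratorAgent: run the other agents in a fixed order
    (the paper describes dynamic direction instead)."""
    components = semantic_parser_agent(description)
    layout = primary_layout_agent(components)
    for element in layout:
        recursive_component_agent(element, library)
    return layout
```

For example, calling `orchestrator_agent("header, product list, footer", {"header": ["logo", "search"]})` returns three top-level elements, with the header split into two side-by-side children.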

Related papers

Designing Domain-Specific Agents via Hierarchical Task Abstraction Mechanism [61.01709143437043]
We introduce a novel agent design framework centered on a Hierarchical Task Abstraction Mechanism (HTAM). Specifically, HTAM moves beyond emulating social roles, instead structuring multi-agent systems into a logical hierarchy that mirrors the intrinsic task-dependency graph of a given domain. We instantiate this framework as EarthAgent, a multi-agent system tailored for complex geospatial analysis.
arXiv Detail & Related papers (2025-11-21T12:25:47Z)
SlideAgent: Hierarchical Agentic Framework for Multi-Page Visual Document Understanding [28.839192349010048]
We introduce SlideAgent, a versatile agentic framework for understanding multi-modal, multi-page, and multi-slide documents. During inference, SlideAgent selectively activates specialized agents for multi-level reasoning and integrates their outputs into coherent, context-aware answers.
arXiv Detail & Related papers (2025-10-30T15:41:15Z)
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC [98.82146219495792]
In this paper, we propose a hierarchical agent framework named PC-Agent. From the perception perspective, we devise an Active Perception Module (APM) to overcome the inadequate abilities of current MLLMs in perceiving screenshot content. From the decision-making perspective, to handle complex user instructions and interdependent subtasks more effectively, we propose a hierarchical multi-agent collaboration architecture.
arXiv Detail & Related papers (2025-02-20T05:41:55Z)
AgentSquare: Automatic LLM Agent Search in Modular Design Space [16.659969168343082]
Large Language Models (LLMs) have led to a rapid growth of agentic systems capable of handling a wide range of complex tasks. We introduce a new research problem: Modularized LLM Agent Search (MoLAS).
arXiv Detail & Related papers (2024-10-08T15:52:42Z)
AutoGen Studio: A No-Code Developer Tool for Building and Debugging Multi-Agent Systems [31.113305753414913]
AUTOGEN STUDIO is a no-code developer tool for rapidly prototyping multi-agent systems. It provides an intuitive drag-and-drop UI for agent specification, interactive evaluation, and a gallery of reusable agent components.
arXiv Detail & Related papers (2024-08-09T03:27:37Z)
AppAgent v2: Advanced Agent for Flexible Mobile Interactions [57.98933460388985]
This work introduces a novel LLM-based multimodal agent framework for mobile devices. Our agent constructs a flexible action space that enhances adaptability across various applications. Our results demonstrate the framework's superior performance, confirming its effectiveness in real-world scenarios.
arXiv Detail & Related papers (2024-08-05T06:31:39Z)
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration [52.25473993987409]
We propose Mobile-Agent-v2, a multi-agent architecture for mobile device operation assistance. The architecture comprises three agents: planning agent, decision agent, and reflection agent. We show that Mobile-Agent-v2 achieves over a 30% improvement in task completion compared to the single-agent architecture.
arXiv Detail & Related papers (2024-06-03T05:50:00Z)
AgentKit: Structured LLM Reasoning with Dynamic Graphs [91.09525140733987]
We propose an intuitive LLM prompting framework (AgentKit) for multifunctional agents. AgentKit offers a unified framework for explicitly constructing a complex "thought process" from simple natural language prompts.
arXiv Detail & Related papers (2024-04-17T15:40:45Z)
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation [72.6168579583414]
CompAgent is a training-free approach for compositional text-to-image generation with a large language model (LLM) agent as its core. Our approach achieves more than 10% improvement on T2I-CompBench, a comprehensive benchmark for open-world compositional T2I generation.
arXiv Detail & Related papers (2024-01-28T16:18:39Z)
AutoAgents: A Framework for Automatic Agent Generation [27.74332323317923]
AutoAgents is an innovative framework that adaptively generates and coordinates multiple specialized agents to build an AI team according to different tasks. Our experiments on various benchmarks demonstrate that AutoAgents generates more coherent and accurate solutions than the existing multi-agent methods.
arXiv Detail & Related papers (2023-09-29T14:46:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
