On AI-Inspired UI-Design
- URL: http://arxiv.org/abs/2406.13631v1
- Date: Wed, 19 Jun 2024 15:28:21 GMT
- Title: On AI-Inspired UI-Design
- Authors: Jialiang Wei, Anne-Lise Courbis, Thomas Lambolais, Gérard Dray, Walid Maalej
- Abstract summary: We discuss three major complementary approaches for using Artificial Intelligence (AI) to support app designers in creating better, more diverse, and more creative UIs for mobile apps.
First, designers can prompt a Large Language Model (LLM) like GPT to directly generate and adjust one or multiple UIs.
Second, a Vision-Language Model (VLM) enables designers to effectively search a large screenshot dataset, e.g. from apps published in app stores.
Third, a Diffusion Model (DM) can be trained specifically to generate app UIs as inspirational images.
- Score: 5.969881132928718
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Graphical User Interface (or simply UI) is a primary means of interaction between users and their devices. In this paper, we discuss three major complementary approaches for using Artificial Intelligence (AI) to support app designers in creating better, more diverse, and more creative UIs for mobile apps. First, designers can prompt a Large Language Model (LLM) like GPT to directly generate and adjust one or multiple UIs. Second, a Vision-Language Model (VLM) enables designers to effectively search a large screenshot dataset, e.g. from apps published in app stores. The third approach is to train a Diffusion Model (DM) specifically designed to generate app UIs as inspirational images. We discuss how AI should be used, in general, to inspire and assist creative app design rather than to automate it.
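To make the three approaches more concrete, the sketches below illustrate each one in Python. They are minimal illustrations of the general techniques rather than the authors' implementations; model names, prompts, datasets, and file paths are assumptions chosen for the examples.

A sketch of the first approach, prompting an LLM to draft and adjust a UI. It assumes the OpenAI Python SDK with an API key in the environment; the prompt wording and model choice are illustrative.

```python
# Approach 1 (sketch): prompt an LLM to generate a first UI draft.
# Assumes the OpenAI Python SDK and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

prompt = (
    "Design the main screen of a recipe-sharing mobile app. "
    "Return a JSON layout with a screen title and a list of components "
    "(type, label, approximate position)."
)

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[
        {"role": "system", "content": "You are a mobile UI designer."},
        {"role": "user", "content": prompt},
    ],
)

# The returned layout can then be adjusted through follow-up prompts.
print(response.choices[0].message.content)
```

A sketch of the second approach, text-based search over a screenshot dataset with a VLM. It uses an open CLIP checkpoint from Hugging Face as a stand-in model and a hypothetical local folder of app-store screenshots.

```python
# Approach 2 (sketch): retrieve visually relevant app screenshots for a text query.
from pathlib import Path

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Embed all screenshots in a hypothetical local folder once, then reuse the index.
paths = sorted(Path("screenshots").glob("*.png"))
images = [Image.open(p).convert("RGB") for p in paths]
with torch.no_grad():
    img_emb = model.get_image_features(**processor(images=images, return_tensors="pt"))
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)

# Embed the designer's query and rank screenshots by cosine similarity.
query = "onboarding screen with illustration and a prominent sign-up button"
with torch.no_grad():
    txt_emb = model.get_text_features(**processor(text=[query], return_tensors="pt"))
    txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)

scores = (img_emb @ txt_emb.T).squeeze(1)
for i in scores.argsort(descending=True)[:5].tolist():
    print(paths[i], round(float(scores[i]), 3))
```

For the third approach, the paper trains a diffusion model specifically on app UIs; as a rough stand-in, sampling inspirational images from a generic text-to-image pipeline in the diffusers library could look like this.

```python
# Approach 3 (sketch): sample inspirational UI images from a diffusion model.
# A UI-specific DM as in the paper is not assumed to be available; a generic
# Stable Diffusion checkpoint is used purely for illustration.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe("clean mobile app login screen, flat design, light color theme").images[0]
image.save("ui_inspiration.png")
```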
Related papers
- ShowUI: One Vision-Language-Action Model for GUI Visual Agent [80.50062396585004]
Building Graphical User Interface (GUI) assistants holds significant promise for enhancing human workflow productivity.
We develop ShowUI, a vision-language-action model for the digital world, with several innovations.
ShowUI, a lightweight 2B model using 256K data, achieves a strong 75.1% accuracy in zero-shot screenshot grounding.
arXiv Detail & Related papers (2024-11-26T14:29:47Z)
- Survey of User Interface Design and Interaction Techniques in Generative AI Applications [79.55963742878684]
We aim to create a compendium of different user-interaction patterns that can be used as a reference for designers and developers alike.
We also strive to lower the entry barrier for those attempting to learn more about the design of generative AI applications.
arXiv Detail & Related papers (2024-10-28T23:10:06Z)
- Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping [55.98643055756135]
We introduce Sketch2Code, a benchmark that evaluates state-of-the-art Vision Language Models (VLMs) on automating the conversion of rudimentary sketches into webpage prototypes.
We analyze ten commercial and open-source models, showing that Sketch2Code is challenging for existing VLMs.
A user study with UI/UX experts reveals a significant preference for proactive question-asking over passive feedback reception.
arXiv Detail & Related papers (2024-10-21T17:39:49Z)
- A Rule-Based Approach for UI Migration from Android to iOS [11.229343760409044]
We propose a novel approach called GUIMIGRATOR, which enables the cross-platform migration of existing Android app UIs to iOS.
GUIMIGRATOR extracts and parses Android UI layouts, views, and resources to construct a UI skeleton tree.
GUIMIGRATOR generates the final UI code files utilizing target code templates, which are then compiled and validated on the iOS development platform.
arXiv Detail & Related papers (2024-09-25T06:19:54Z)
- Tell Me What's Next: Textual Foresight for Generic UI Representations [65.10591722192609]
We propose Textual Foresight, a novel pretraining objective for learning UI screen representations.
Textual Foresight generates global text descriptions of future UI states given a current UI and local action taken.
We train on our newly constructed mobile app dataset, OpenApp, the first public dataset for app UI representation learning.
arXiv Detail & Related papers (2024-06-12T02:43:19Z)
- PromptInfuser: How Tightly Coupling AI and UI Design Impacts Designers' Workflows [23.386764579779538]
We investigate how coupling prompt and UI design affects designers' AI iteration.
To ground this research, we developed PromptInfuser, a Figma plugin that enables users to create mockups.
In a study with 14 designers, we compare PromptInfuser to designers' current AI-prototyping workflow.
arXiv Detail & Related papers (2023-10-24T01:04:27Z)
- MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning [65.60607895153692]
MiniGPT-v2 is a model that can be treated as a unified interface for better handling various vision-language tasks.
We propose using unique identifiers for different tasks when training the model.
Our results show that MiniGPT-v2 achieves strong performance on many visual question-answering and visual grounding benchmarks.
arXiv Detail & Related papers (2023-10-14T03:22:07Z)
- Spotlight: Mobile UI Understanding using Vision-Language Models with a Focus [9.401663915424008]
We propose a vision-language model that only takes the screenshot of the UI and a region of interest on the screen as the input.
Our experiments show that our model obtains SoTA results on several representative UI tasks and outperforms previous methods.
arXiv Detail & Related papers (2022-09-29T16:45:43Z)
- How to Prompt? Opportunities and Challenges of Zero- and Few-Shot Learning for Human-AI Interaction in Creative Applications of Generative Models [29.420160518026496]
We discuss the opportunities and challenges for interactive creative applications that use prompting as a new paradigm for Human-AI interaction.
Based on our analysis, we propose four design goals for user interfaces that support prompting.
We illustrate these with concrete UI design sketches, focusing on the use case of creative writing.
arXiv Detail & Related papers (2022-09-03T10:16:34Z)
- VINS: Visual Search for Mobile User Interface Design [66.28088601689069]
This paper introduces VINS, a visual search framework that takes a UI image as input and retrieves visually similar design examples.
The framework achieves a mean Average Precision of 76.39% for UI detection and high performance in querying similar UI designs.
arXiv Detail & Related papers (2021-02-10T01:46:33Z)
- BlackBox Toolkit: Intelligent Assistance to UI Design [9.749560288448114]
We propose to modify the UI design process by assisting it with artificial intelligence (AI).
We propose to enable AI to perform repetitive tasks for the designer while allowing the designer to take command of the creative process.
arXiv Detail & Related papers (2020-04-04T14:50:26Z)