Related papers: A Rule-Based Approach for UI Migration from Android to iOS

URL: http://arxiv.org/abs/2409.16656v1
Date: Wed, 25 Sep 2024 06:19:54 GMT
Title: A Rule-Based Approach for UI Migration from Android to iOS
Authors: Yi Gao, Xing Hu, Tongtong Xu, Xin Xia, Xiaohu Yang
Abstract summary: We propose a novel approach called GUIMIGRATOR, which enables the cross-platform migration of existing Android app UIs to iOS. GUIMIGRATOR extracts and parses Android UI layouts, views, and resources to construct a UI skeleton tree. GUIMIGRATOR generates the final UI code files utilizing target code templates, which are then compiled and validated in the iOS development platform.
Score: 11.229343760409044
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the mobile development process, creating the user interface (UI) is highly resource-intensive. Consequently, numerous studies have focused on automating UI development, such as generating UI from screenshots or design specifications. However, they heavily rely on computer vision techniques for image recognition. Any recognition errors can cause invalid UI element generation, compromising the effectiveness of these automated approaches. Moreover, developing an app UI from scratch remains a time-consuming and labor-intensive task. To address this challenge, we propose a novel approach called GUIMIGRATOR, which enables the cross-platform migration of existing Android app UIs to iOS, thereby automatically generating UI to facilitate the reuse of existing UI. This approach not only avoids errors from screenshot recognition but also reduces the cost of developing UIs from scratch. GUIMIGRATOR extracts and parses Android UI layouts, views, and resources to construct a UI skeleton tree. GUIMIGRATOR generates the final UI code files utilizing target code templates, which are then compiled and validated in the iOS development platform, i.e., Xcode. We evaluate the effectiveness of GUIMIGRATOR on 31 Android open-source applications across ten domains. The results show that GUIMIGRATOR achieves a UI similarity score of 78 between migration screenshots, substantially outperforming two popular existing LLMs. Additionally, GUIMIGRATOR demonstrates high efficiency, taking only 7.6 seconds to migrate the datasets. These findings indicate that GUIMIGRATOR effectively facilitates the reuse of Android UI code on iOS, leveraging the strengths of both platforms' UI frameworks and making new contributions to cross-platform development.
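
Illustration: the pipeline described in the abstract (parse Android layouts into a UI skeleton tree, then emit target code from templates) can be sketched roughly as follows. This is a minimal sketch and not GUIMIGRATOR's actual implementation: the rule table `RULES`, the helper names `build_skeleton`, `render`, and `migrate`, and the choice of SwiftUI as the target framework are all assumptions made for illustration. It handles only three view types and ignores resources such as `@string/...` references and string escaping.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Namespace that Android layout files use for android:* attributes.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Hypothetical migration rules: Android view -> SwiftUI code template.
RULES = {
    "LinearLayout:vertical": "VStack {{\n{children}\n{indent}}}",
    "LinearLayout:horizontal": "HStack {{\n{children}\n{indent}}}",
    "TextView": 'Text("{text}")',
    "Button": 'Button("{text}") {{ }}',
}


@dataclass
class Node:
    """One node of the UI skeleton tree."""
    rule_key: str
    text: str = ""
    children: list = field(default_factory=list)


def build_skeleton(element):
    """Recursively convert a parsed layout element into a skeleton node."""
    key = element.tag
    if key == "LinearLayout":
        # LinearLayout is horizontal unless android:orientation says otherwise.
        key += ":" + element.get(ANDROID_NS + "orientation", "horizontal")
    return Node(
        rule_key=key,
        text=element.get(ANDROID_NS + "text", ""),
        children=[build_skeleton(child) for child in element],
    )


def render(node, depth=0):
    """Fill the template matching this node; unknown views become a comment."""
    indent = "    " * depth
    template = RULES.get(node.rule_key)
    if template is None:
        return f"{indent}// TODO: no rule for {node.rule_key}"
    children = "\n".join(render(child, depth + 1) for child in node.children)
    return indent + template.format(text=node.text, children=children, indent=indent)


def migrate(layout_xml):
    """Parse an Android layout string and return SwiftUI-style source text."""
    return render(build_skeleton(ET.fromstring(layout_xml.strip())))


LAYOUT = """
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:orientation="vertical">
    <TextView android:text="Hello" />
    <Button android:text="Save" />
</LinearLayout>
"""

if __name__ == "__main__":
    print(migrate(LAYOUT))
```

Keeping the rules in a lookup table separate from the tree walk means a new view type can be supported by adding one entry. Views without a rule are emitted as a comment rather than silently dropped, so gaps remain visible in the generated code.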

Related papers

UI-TARS: Pioneering Automated GUI Interaction with Native Agents [58.18100825673032]
This paper introduces UI-TARS, a native GUI agent model that perceives only screenshots as input and performs human-like interactions. In the OSWorld benchmark, UI-TARS achieves scores of 24.6 with 50 steps and 22.7 with 15 steps, outperforming Claude (22.0 and 14.9, respectively).
arXiv Detail & Related papers (2025-01-21T17:48:10Z)
UITrans: Seamless UI Translation from Android to HarmonyOS [20.2752697820237]
We present UITrans, the first automated UI translation tool designed for Android to HarmonyOS. Our evaluation on six Android applications demonstrates that UITrans achieves translation success rates of over 90.1%, 89.3%, and 89.2% at the component, page, and project levels, respectively.
arXiv Detail & Related papers (2024-12-18T10:33:55Z)
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction [69.57190742976091]
We introduce Aguvis, a unified vision-based framework for autonomous GUI agents. Our approach leverages image-based observations and grounds natural-language instructions to visual elements. To address the limitations of previous work, we integrate explicit planning and reasoning within the model.
arXiv Detail & Related papers (2024-12-05T18:58:26Z)
ShowUI: One Vision-Language-Action Model for GUI Visual Agent [80.50062396585004]
Building Graphical User Interface (GUI) assistants holds significant promise for enhancing human workflow productivity. We develop a vision-language-action model for the digital world, namely ShowUI. ShowUI, a lightweight 2B model using 256K data, achieves a strong 75.1% accuracy in zero-shot screenshot grounding.
arXiv Detail & Related papers (2024-11-26T14:29:47Z)
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents [55.37173845836839]
OS-Atlas is a foundational GUI action model that excels at GUI grounding and OOD agentic tasks. We are releasing the largest open-source cross-platform GUI grounding corpus to date, which contains over 13 million GUI elements.
arXiv Detail & Related papers (2024-10-30T17:10:19Z)
Toward the Automated Localization of Buggy Mobile App UIs from Bug Descriptions [19.304569170230316]
The identification of buggy UI screens and UI components is important to localizing the buggy behavior and fixing it. This paper is the first to investigate the feasibility of automating the task of Buggy UI localization. We find that incorporating localized buggy UIs leads to improvements of 9%-12% in Hits@10.
arXiv Detail & Related papers (2024-08-07T20:26:20Z)
AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents [50.39555842254652]
We introduce the Android Multi-annotation EXpo (AMEX) to advance research on AI agents in mobile scenarios. AMEX comprises over 104K high-resolution screenshots from 110 popular mobile applications, which are annotated at multiple levels. AMEX includes three levels of annotations: GUI interactive element grounding, GUI screen and element functionality descriptions, and complex natural language instructions.
arXiv Detail & Related papers (2024-07-03T17:59:58Z)
Automatically Generating UI Code from Screenshot: A Divide-and-Conquer-Based Approach [51.522121376987634]
We propose DCGen, a divide-and-conquer-based approach to automate the translation of webpage design to UI code. DCGen starts by dividing screenshots into manageable segments, generating descriptions for each segment, and then reassembling them into complete UI code for the entire screenshot. We conduct extensive testing with a dataset composed of real-world websites and various MLLMs and demonstrate that DCGen achieves up to a 14% improvement in visual similarity over competing methods.
arXiv Detail & Related papers (2024-06-24T07:58:36Z)
On AI-Inspired UI-Design [5.969881132928718]
We discuss three major complementary approaches to using Artificial Intelligence (AI) to help app designers create better, more diverse, and more creative UIs for mobile apps. First, designers can prompt a Large Language Model (LLM) like GPT to directly generate and adjust one or multiple UIs. Second, a Vision-Language Model (VLM) enables designers to effectively search a large screenshot dataset, e.g., from apps published in app stores. Third, a Diffusion Model (DM) can be specifically designed to generate app UIs as inspirational images.
arXiv Detail & Related papers (2024-06-19T15:28:21Z)
Tell Me What's Next: Textual Foresight for Generic UI Representations [65.10591722192609]
We propose Textual Foresight, a novel pretraining objective for learning UI screen representations. Textual Foresight generates global text descriptions of future UI states given a current UI and local action taken. We train with our newly constructed mobile app dataset, OpenApp, which results in the first public dataset for app UI representation learning.
arXiv Detail & Related papers (2024-06-12T02:43:19Z)
UI Semantic Group Detection: Grouping UI Elements with Similar Semantics in Mobile Graphical User Interface [10.80156450091773]
Existing studies on UI elements grouping mainly focus on a single UI-related software engineering task, and their groups vary in appearance and function. We propose our semantic component groups that pack adjacent text and non-text elements with similar semantics. To recognize semantic component groups on a UI page, we propose a robust, deep learning-based vision detector, UISCGD.
arXiv Detail & Related papers (2024-03-08T01:52:44Z)
Spotlight: Mobile UI Understanding using Vision-Language Models with a Focus [9.401663915424008]
We propose a vision-language model that only takes the screenshot of the UI and a region of interest on the screen as the input. Our experiments show that our model obtains SoTA results on several representative UI tasks and outperforms previous methods.
arXiv Detail & Related papers (2022-09-29T16:45:43Z)
VINS: Visual Search for Mobile User Interface Design [66.28088601689069]
This paper introduces VINS, a visual search framework, that takes as input a UI image and retrieves visually similar design examples. The framework achieves a mean Average Precision of 76.39% for the UI detection and high performance in querying similar UI designs.
arXiv Detail & Related papers (2021-02-10T01:46:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.