UI Layers Merger: Merging UI layers via Visual Learning and Boundary Prior
- URL: http://arxiv.org/abs/2206.13389v1
- Date: Sat, 18 Jun 2022 16:09:28 GMT
- Title: UI Layers Merger: Merging UI layers via Visual Learning and Boundary Prior
- Authors: Yun-nong Chen, Yan-kun Zhen, Chu-ning Shi, Jia-zhi Li, Ting-ting Zhou, Yan-fang Chang, Ling-yun Sun, Liu-qing Chen
- Abstract summary: Fragmented layers inevitably appear in UI design drafts, which greatly reduces the quality of code generation.
We propose UI Layers Merger (UILM), a vision-based method that can automatically detect and merge fragmented layers into UI components.
- Score: 7.251022347055101
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the fast-growing GUI development workload in the Internet industry, some
intelligent methods have attempted to generate maintainable front-end code from
UI screenshots. UI design drafts that contain UI metadata are an even more
suitable input for this task. However, fragmented layers inevitably appear in
UI design drafts, which greatly reduces the quality of code generation. None of
the existing automated GUI techniques detects and merges the fragmented layers
to improve the accessibility of the generated code. In this paper, we propose UI
Layers Merger (UILM), a vision-based method that can automatically detect and
merge fragmented layers into UI components. UILM contains a Merging Area
Detector (MAD) and a layers merging algorithm. MAD incorporates boundary
prior knowledge to accurately detect the boundaries of UI components. The
layers merging algorithm then searches for the associated layers within each
component's boundary and merges them into a whole part. We present a dynamic
data augmentation approach to boost the performance of MAD, and construct a
large-scale UI dataset for training MAD and testing the performance of
UILM. Experiments show that the proposed method outperforms the best
baseline on merging-area detection and achieves decent accuracy on layers
merging.
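To make the merging step concrete, here is a minimal sketch of the containment idea behind the layers merging algorithm, assuming MAD has already produced component bounding boxes. The `Layer` structure, the containment threshold, and all names are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Layer:
    name: str
    x: float  # left
    y: float  # top
    w: float
    h: float

def containment(layer: Layer, box: Tuple[float, float, float, float]) -> float:
    """Fraction of the layer's area that lies inside the component box."""
    bx, by, bw, bh = box
    ix = max(0.0, min(layer.x + layer.w, bx + bw) - max(layer.x, bx))
    iy = max(0.0, min(layer.y + layer.h, by + bh) - max(layer.y, by))
    area = layer.w * layer.h
    return (ix * iy) / area if area > 0 else 0.0

def merge_layers(layers: List[Layer],
                 component_boxes: List[Tuple[float, float, float, float]],
                 thresh: float = 0.9) -> List[List[Layer]]:
    """Group layers whose boxes fall (mostly) inside a detected component.

    Each returned group would then be merged into a single component layer.
    """
    groups = []
    for box in component_boxes:
        inside = [l for l in layers if containment(l, box) >= thresh]
        if len(inside) > 1:  # only multi-layer (fragmented) areas need merging
            groups.append(inside)
    return groups
```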
Related papers
- A Rule-Based Approach for UI Migration from Android to iOS [11.229343760409044]
We propose a novel approach called GUIMIGRATOR, which enables the cross-platform migration of existing Android app UIs to iOS.
GUIMIGRATOR extracts and parses Android UI layouts, views, and resources to construct a UI skeleton tree.
GUIMIGRATOR generates the final UI code files utilizing target code templates, which are then compiled and validated on the iOS development platform.
arXiv Detail & Related papers (2024-09-25T06:19:54Z)
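The UI skeleton tree mentioned in the entry above can be pictured as a plain recursive structure with a view-type mapping applied during generation. A minimal sketch, with all field names and the Android-to-UIKit mapping assumed for illustration (the paper's actual representation may differ):

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class SkeletonNode:
    """One node of a UI skeleton tree: a view type plus its attributes."""
    view_type: str                      # e.g. "TextView" on Android
    attrs: Dict[str, str] = field(default_factory=dict)
    children: List["SkeletonNode"] = field(default_factory=list)

# Hypothetical Android -> iOS (UIKit) view-type mapping used during generation.
VIEW_MAP = {"TextView": "UILabel", "ImageView": "UIImageView", "Button": "UIButton"}

def to_ios(node: SkeletonNode) -> SkeletonNode:
    """Recursively rewrite the skeleton tree with target-platform view types."""
    return SkeletonNode(
        view_type=VIEW_MAP.get(node.view_type, "UIView"),
        attrs=dict(node.attrs),
        children=[to_ios(c) for c in node.children],
    )
```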
- Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection [57.883265488038134]
We propose a hierarchical graph interaction network termed HGINet for camouflaged object detection.
The network is capable of discovering imperceptible objects via effective graph interaction among the hierarchical tokenized features.
Our experiments demonstrate the superior performance of HGINet compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2024-08-27T12:53:25Z)
- Tell Me What's Next: Textual Foresight for Generic UI Representations [65.10591722192609]
We propose Textual Foresight, a novel pretraining objective for learning UI screen representations.
Textual Foresight generates global text descriptions of future UI states given a current UI and a local action taken.
We train with our newly constructed mobile app dataset, OpenApp, which results in the first public dataset for app UI representation learning.
arXiv Detail & Related papers (2024-06-12T02:43:19Z)
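The Textual Foresight objective in the entry above pairs a current screen and an action with a text description of the resulting screen. A rough sketch of how such training examples might be assembled; the trace layout and all names are assumptions, not the OpenApp format:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ForesightExample:
    screen_image: str    # path to the current UI screenshot
    action: str          # local action, e.g. "tap element #3"
    target_caption: str  # global text description of the *next* UI state

def build_example(trace: List[Tuple[str, str, str]], t: int) -> ForesightExample:
    """Turn step t of an interaction trace into a foresight example.

    `trace` is assumed to be a list of (screenshot, action, caption) steps;
    the caption of step t + 1 becomes the prediction target for step t.
    """
    screen, action, _ = trace[t]
    _, _, next_caption = trace[t + 1]
    return ForesightExample(screen, action, next_caption)
```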
- ILuvUI: Instruction-tuned LangUage-Vision modeling of UIs from Machine Conversations [13.939350184164017]
Multimodal Vision-Language Models (VLMs) enable powerful applications from their fused understanding of images and language.
We adapt a recipe for generating paired text-image training data for VLMs to the UI domain by combining existing pixel-based methods with a Large Language Model (LLM).
We generate a dataset of 335K conversational examples paired with UIs that cover Q&A, UI descriptions, and planning, and use it to fine-tune a conversational VLM for UI tasks.
arXiv Detail & Related papers (2023-10-07T16:32:34Z)
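The data recipe in the entry above can be caricatured as: run a pixel-based detector, serialize its output into a prompt, and ask an LLM for a conversation. In this sketch, `detect_elements` and `llm` are placeholder callables, not real APIs, and the prompt wording is invented:

```python
from typing import Callable, Dict, List, Tuple

def make_conversation(screenshot_path: str,
                      detect_elements: Callable[[str], List[Tuple[str, tuple]]],
                      llm: Callable[[str], str]) -> Dict[str, str]:
    """Sketch of LLM-aided UI training-data generation.

    `detect_elements` stands in for any pixel-based UI detector returning
    (label, bounding-box) pairs; `llm` for any text-completion callable.
    Both are assumptions; the paper's actual recipe differs in detail.
    """
    elements = detect_elements(screenshot_path)
    layout = "\n".join(f"{label} at {box}" for label, box in elements)
    prompt = (
        "Here is the element layout of a mobile UI screen:\n"
        f"{layout}\n"
        "Write a short Q&A conversation about this screen."
    )
    return {"image": screenshot_path, "conversation": llm(prompt)}
```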
- UI Layers Group Detector: Grouping UI Layers via Text Fusion and Box Attention [7.614630088064978]
We propose a vision-based method that automatically detects images (i.e., basic shapes and visual elements) and text layers that present the same semantic meanings.
We construct a large-scale UI dataset for training and testing, and present a data augmentation approach to boost the detection performance.
arXiv Detail & Related papers (2022-12-07T03:50:20Z)
- ULDGNN: A Fragmented UI Layer Detector Based on Graph Neural Networks [7.614630088064978]
Fragmented layers could degrade code quality if they are all involved in code generation without being merged into a whole part.
In this paper, we propose a pipeline to merge fragmented layers automatically.
Our approach can retrieve most fragmented layers in UI design drafts, and achieve 87% accuracy in the detection task.
arXiv Detail & Related papers (2022-08-13T14:14:37Z)
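A GNN-based detector like the one above needs the design layers expressed as a graph first. One plausible construction, purely illustrative and not the paper's method, connects layers whose bounding-box centers are close, so the network can reason over spatial neighborhoods:

```python
import math
from typing import Dict, List, Tuple

def layer_graph(centers: List[Tuple[float, float]],
                max_dist: float = 32.0) -> Dict[int, List[int]]:
    """Build an adjacency list over layers.

    Nodes are layer indices; edges connect layers whose box centers are
    within `max_dist` pixels. `centers` is a list of (cx, cy) box centers.
    """
    edges: Dict[int, List[int]] = {i: [] for i in range(len(centers))}
    for i, (xi, yi) in enumerate(centers):
        for j, (xj, yj) in enumerate(centers):
            if i < j and math.hypot(xi - xj, yi - yj) <= max_dist:
                edges[i].append(j)
                edges[j].append(i)
    return edges
```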
- VINS: Visual Search for Mobile User Interface Design [66.28088601689069]
This paper introduces VINS, a visual search framework that takes a UI image as input and retrieves visually similar design examples.
The framework achieves a mean Average Precision of 76.39% for UI detection and high performance in querying similar UI designs.
arXiv Detail & Related papers (2021-02-10T01:46:33Z)
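Visual search as in the entry above typically reduces to embedding plus nearest-neighbor lookup. A minimal numpy sketch, assuming some embedding model has already mapped each UI screenshot to a feature vector (the framework's actual pipeline also detects UI elements first):

```python
import numpy as np

def retrieve(query_vec: np.ndarray, gallery: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k gallery embeddings most similar to the query.

    Uses cosine similarity; `gallery` is an (N, D) matrix of UI embeddings
    and `query_vec` a length-D vector for the query screenshot.
    """
    q = query_vec / np.linalg.norm(query_vec)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q
    return np.argsort(-sims)[:k]
```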
- Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability on some challenging scenarios.
We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)
- Saliency Enhancement using Gradient Domain Edges Merging [65.90255950853674]
We develop a method that merges edges with saliency maps to improve saliency detection performance.
This leads to our proposed saliency enhancement using edges (SEE), with an average improvement of at least 3.4 times on the DUT-OMRON dataset.
The SEE algorithm is split into two parts: SEE-Pre for preprocessing and SEE-Post for postprocessing.
arXiv Detail & Related papers (2020-02-11T14:04:56Z)
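The core idea of the entry above, folding edge information back into a saliency map, can be caricatured in a few lines of numpy. This is a crude simplification using finite-difference edges and a linear blend, not the gradient-domain SEE algorithm itself; the `weight` parameter is an assumption:

```python
import numpy as np

def enhance_saliency(saliency: np.ndarray, gray: np.ndarray,
                     weight: float = 0.5) -> np.ndarray:
    """Blend a saliency map with image edge magnitude.

    Both inputs are assumed to be 2-D float arrays in [0, 1]; edges are
    computed with simple finite differences and the result renormalized.
    """
    gy, gx = np.gradient(gray)
    edges = np.hypot(gx, gy)
    edges /= edges.max() + 1e-8
    out = saliency + weight * edges
    return out / (out.max() + 1e-8)
```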
- Convolutional Networks with Dense Connectivity [59.30634544498946]
We introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion.
For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers.
We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks.
arXiv Detail & Related papers (2020-01-08T06:54:53Z)
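Dense connectivity is easy to show concretely: each layer receives the concatenation of all earlier feature maps and contributes a fixed number of new channels. A minimal PyTorch sketch of one dense block; the growth rate and layer count are illustrative, not the paper's configuration:

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each conv layer takes the concatenation of all previous feature maps
    and contributes `growth_rate` new channels to the running feature set."""

    def __init__(self, in_channels: int, growth_rate: int = 12, num_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            channels = in_channels + i * growth_rate
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1),
            ))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))
            features.append(out)  # feature reuse: later layers see all outputs
        return torch.cat(features, dim=1)
```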