Automated Mapping of Adaptive App GUIs from Phones to TVs
- URL: http://arxiv.org/abs/2307.12522v2
- Date: Sun, 5 Nov 2023 04:31:34 GMT
- Title: Automated Mapping of Adaptive App GUIs from Phones to TVs
- Authors: Han Hu, Ruiqi Dong, John Grundy, Thai Minh Nguyen, Huaxiao Liu,
Chunyang Chen
- Abstract summary: Existing techniques to map a mobile app GUI to a TV either adopt a responsive design or use mirror apps for improved video display.
We propose a semi-automated approach to generate corresponding adaptive TV GUIs, given the phone GUIs as the input.
Our tool is not only beneficial to developers but also to GUI designers, who can further customize the generated GUIs for their TV app development.
- Score: 31.207923538204795
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the increasing interconnection of smart devices, users often desire to
adopt the same app on quite different devices for identical tasks, such as
watching the same movies on both their smartphones and TVs. However, the
significant differences in screen size, aspect ratio, and interaction styles
make it challenging to adapt Graphical User Interfaces (GUIs) across these
devices. Although there are millions of apps available on Google Play, only a
few thousand are designed to support smart TV displays. Existing techniques to
map a mobile app GUI to a TV either adopt a responsive design, which struggles
to bridge the substantial gap between phone and TV, or use mirror apps for
improved video display, which require hardware support and extra engineering
effort. Instead of developing another app to support TVs, we propose a
semi-automated approach to generate corresponding adaptive TV GUIs, given the
phone GUIs as input. Based on our empirical study of GUI pairs for TVs and
phones in existing apps, we synthesize a list of rules for grouping and
classifying phone GUIs, converting them to TV GUIs, and generating dynamic TV
layouts and source code for the TV display. Our tool is not only beneficial to
developers but also to GUI designers, who can further customize the generated
GUIs for their TV app development. An evaluation and user study demonstrate the
accuracy of our generated GUIs and the usefulness of our tool.
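Since the abstract only names the pipeline stages, a minimal Python sketch may help make them concrete: group phone widgets by spatial proximity, classify each group, and emit a landscape TV layout skeleton. Every rule, threshold, class, and emitted tag below is an illustrative assumption, not the paper's actual rule set or generated code.

```python
# A minimal, hypothetical sketch of rule-based phone-to-TV GUI mapping:
# group widgets, classify groups, emit a TV layout skeleton. All names and
# thresholds are assumptions; the paper derives its rules empirically.
from dataclasses import dataclass, field

@dataclass
class Widget:
    kind: str   # e.g. "ImageView", "TextView", "Button"
    x: int
    y: int
    w: int
    h: int

@dataclass
class Group:
    role: str                      # e.g. "navigation", "content_card"
    widgets: list = field(default_factory=list)

def group_widgets(widgets, y_gap=24):
    """Group widgets whose vertical gap is below a threshold (an assumed
    proximity rule)."""
    groups, current = [], []
    for w in sorted(widgets, key=lambda w: w.y):
        if current and w.y - (current[-1].y + current[-1].h) > y_gap:
            groups.append(current)
            current = []
        current.append(w)
    if current:
        groups.append(current)
    return groups

def classify(group):
    """Assign a role from simple composition rules (illustrative only)."""
    kinds = {w.kind for w in group}
    if kinds == {"Button"}:
        return Group("navigation", group)
    if "ImageView" in kinds and "TextView" in kinds:
        return Group("content_card", group)
    return Group("other", group)

def to_tv_layout(groups):
    """Emit a landscape TV skeleton: navigation in a left sidebar, the
    remaining groups in a right content column."""
    nav = [g for g in groups if g.role == "navigation"]
    rest = [g for g in groups if g.role != "navigation"]
    side = "".join(f"    <nav: {len(g.widgets)} widgets>\n" for g in nav)
    body = "".join(f"    <{g.role}: {len(g.widgets)} widgets>\n" for g in rest)
    return ("<row>\n  <column id='sidebar'>\n" + side +
            "  </column>\n  <column id='content'>\n" + body +
            "  </column>\n</row>")

# Toy phone screen: a two-button bar above an image-plus-caption card.
phone = [Widget("Button", 0, 0, 100, 40), Widget("Button", 110, 0, 100, 40),
         Widget("ImageView", 0, 120, 320, 180), Widget("TextView", 0, 310, 320, 40)]
print(to_tv_layout([classify(g) for g in group_widgets(phone)]))
```

The sidebar-plus-content split mirrors the common landscape TV pattern the abstract's empirical study alludes to; the real tool additionally generates dynamic layouts and source code.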
Related papers
- MobileFlow: A Multimodal LLM For Mobile GUI Agent [4.7619361168442005]
This paper introduces MobileFlow, a multimodal large language model meticulously crafted for mobile GUI agents.
MobileFlow contains approximately 21 billion parameters and is equipped with novel hybrid visual encoders.
It has the capacity to fully interpret image data and comprehend user instructions for GUI interaction tasks.
arXiv Detail & Related papers (2024-07-05T08:37:10Z)
- AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents [50.39555842254652]
We introduce the Android Multi-annotation EXpo (AMEX) to advance research on AI agents in mobile scenarios.
AMEX comprises over 104K high-resolution screenshots from 110 popular mobile applications, which are annotated at multiple levels.
AMEX includes three levels of annotations: GUI interactive element grounding, GUI screen and element functionality descriptions, and complex natural language instructions.
arXiv Detail & Related papers (2024-07-03T17:59:58Z)
- GUICourse: From General Vision Language Models to Versatile GUI Agents [75.5150601913659]
We contribute GUICourse, a suite of datasets to train visual-based GUI agents.
First, we introduce the GUIEnv dataset to strengthen the OCR and grounding capabilities of VLMs.
Then, we introduce the GUIAct and GUIChat datasets to enrich their knowledge of GUI components and interactions.
arXiv Detail & Related papers (2024-06-17T08:30:55Z)
- GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents [73.9254861755974]
This paper introduces a new dataset, called GUI-World, which features meticulously crafted Human-MLLM annotations.
We evaluate the capabilities of current state-of-the-art MLLMs, including ImageLLMs and VideoLLMs, in understanding various types of GUI content.
arXiv Detail & Related papers (2024-06-16T06:56:53Z)
- GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices [61.48043339441149]
GUI Odyssey consists of 7,735 episodes from 6 mobile devices, spanning 6 types of cross-app tasks, 201 apps, and 1.4K app combos.
We developed OdysseyAgent, a multimodal cross-app navigation agent, by fine-tuning the Qwen-VL model with a history resampling module.
arXiv Detail & Related papers (2024-06-12T17:44:26Z)
- GUing: A Mobile GUI Search Engine using a Vision-Language Model [6.024602799136753]
This paper proposes GUing, a GUI search engine based on a vision-language model called GUIClip.
We first collected app introduction images from Google Play, which display the most representative screenshots.
Then, we developed an automated pipeline to classify, crop, and extract the captions from these images.
We used this dataset to train a novel vision-language model, which is, to the best of our knowledge, the first of its kind for GUI retrieval.
arXiv Detail & Related papers (2024-04-30T18:42:18Z)
- GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation [167.6232690168905]
MM-Navigator is a GPT-4V-based agent for the smartphone graphical user interface (GUI) navigation task.
MM-Navigator can interact with a smartphone screen as human users do, and determine subsequent actions to fulfill given instructions.
arXiv Detail & Related papers (2023-11-13T18:53:37Z)
- Pairwise GUI Dataset Construction Between Android Phones and Tablets [24.208087862974033]
The Papt dataset is a pairwise GUI dataset tailored for Android phones and tablets.
We propose novel pairwise GUI collection approaches for constructing this dataset.
arXiv Detail & Related papers (2023-10-07T09:30:42Z)
- A Pairwise Dataset for GUI Conversion and Retrieval between Android Phones and Tablets [24.208087862974033]
The Papt dataset is a pairwise dataset for GUI conversion and retrieval between Android phones and tablets.
The dataset contains 10,035 phone-tablet GUI page pairs from 5,593 phone-tablet app pairs.
arXiv Detail & Related papers (2023-07-25T03:25:56Z)
- NiCro: Purely Vision-based, Non-intrusive Cross-Device and Cross-Platform GUI Testing [19.462053492572142]
We propose NiCro, a non-intrusive cross-device and cross-platform system.
NiCro uses a state-of-the-art GUI widget detector to detect widgets in GUI images and then analyses a comprehensive set of information to match the widgets across diverse devices (see the sketch after this list).
At the system level, NiCro can interact with a virtual device farm and a robotic arm system to perform cross-device, cross-platform testing non-intrusively.
arXiv Detail & Related papers (2023-05-24T01:19:05Z)
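As a rough illustration of the cross-device widget matching that the NiCro summary above mentions, here is a hedged Python sketch that scores candidate widget pairs across two devices by combining text and position similarity. The cues, weights, and dictionary fields are assumptions for illustration, not NiCro's actual method.

```python
# Hypothetical cross-device widget matching in the spirit of NiCro's summary:
# combine several similarity cues and keep the best-scoring counterpart for
# each detected widget. Cues, weights, and field names are assumptions.
from difflib import SequenceMatcher

def text_sim(a: str, b: str) -> float:
    """Fuzzy similarity of the widgets' text labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def position_sim(a: dict, b: dict) -> float:
    """Closeness of normalised (0..1) centre coordinates, so screens of
    different sizes and resolutions stay comparable."""
    return 1.0 - (abs(a["cx"] - b["cx"]) + abs(a["cy"] - b["cy"])) / 2.0

def match_widgets(src: list, dst: list, w_text: float = 0.6, w_pos: float = 0.4):
    """Greedily pair each source widget with its best-scoring target."""
    return [
        (s, max(dst, key=lambda d: w_text * text_sim(s["text"], d["text"])
                                   + w_pos * position_sim(s, d)))
        for s in src
    ]

# Example: match a phone "Play" button against two tablet widgets.
phone_w = [{"text": "Play", "cx": 0.5, "cy": 0.9}]
tablet_w = [{"text": "Play", "cx": 0.48, "cy": 0.85},
            {"text": "Settings", "cx": 0.9, "cy": 0.1}]
print(match_widgets(phone_w, tablet_w))
```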
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.