Applied Awareness: Test-Driven GUI Development using Computer Vision and
Cryptography
- URL: http://arxiv.org/abs/2006.03725v1
- Date: Fri, 5 Jun 2020 22:46:48 GMT
- Title: Applied Awareness: Test-Driven GUI Development using Computer Vision and
Cryptography
- Authors: Donald Beaver
- Abstract summary: Test-driven development is impractical: it generally requires an initial implementation of the GUI to generate golden images or to construct interactive test scenarios.
We demonstrate a novel and immediately applicable approach of interpreting GUI presentation in terms of backend communications.
This focus on backend communication circumvents deficiencies in typical testing methodologies that rely on platform-dependent UI affordances or accessibility features.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graphical user interface testing is significantly challenging, and automating
it even more so. Test-driven development is impractical: it generally requires
an initial implementation of the GUI to generate golden images or to construct
interactive test scenarios, and subsequent maintenance is costly. While
computer vision has been applied to several aspects of GUI testing, we
demonstrate a novel and immediately applicable approach of interpreting GUI
presentation in terms of backend communications, modeling "awareness" in the
fashion employed by cryptographic proofs of security. This focus on backend
communication circumvents deficiencies in typical testing methodologies that
rely on platform-dependent UI affordances or accessibility features. Our
interdisciplinary work is ready for off-the-shelf practice: we report
self-contained, practical implementation with both online and offline
validation, using simple designer specifications at the outset and specifically
avoiding any requirements for a bootstrap implementation or golden images. In
addition to practical implementation, ties to formal verification methods in
cryptography are explored and explained, providing fertile perspectives on
assurance in UI and interpretability in AI.
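As an informal illustration of the abstract's core idea, the following is a minimal, hypothetical Python sketch of a backend-communication test oracle. It is not the paper's implementation: the BackendMessage type, the transcript commitment, and the stubbed rendered values are all invented for the example. The oracle passes only when every value the GUI displays can be attributed to ("is aware of") a recorded backend message, loosely in the spirit of a cryptographic proof of knowledge; in practice the displayed values would be recovered from screenshots by computer vision (OCR or template matching) rather than stubbed in.

import hashlib
from dataclasses import dataclass

# Hypothetical sketch (not the paper's actual implementation): a GUI test passes
# when every value it displays is "aware" of -- i.e. attributable to -- some
# recorded backend message, in the spirit of a cryptographic knowledge extractor.

@dataclass(frozen=True)
class BackendMessage:
    channel: str
    payload: str

def transcript_digest(messages):
    """Commit to the backend transcript so the test oracle cannot be retrofitted."""
    digest = hashlib.sha256()
    for message in messages:
        digest.update(f"{message.channel}:{message.payload}".encode())
    return digest.hexdigest()

def gui_is_aware(rendered_texts, messages):
    """Pass iff every rendered value can be traced to a backend message."""
    known_payloads = {message.payload for message in messages}
    return all(text in known_payloads for text in rendered_texts)

if __name__ == "__main__":
    backend = [BackendMessage("quotes", "AAPL 191.45"),
               BackendMessage("quotes", "MSFT 415.10")]
    # In practice the rendered values would be recovered from a screenshot by
    # computer vision; here they are stubbed in for the sake of the example.
    rendered = ["AAPL 191.45", "MSFT 415.10"]
    print("backend transcript commitment:", transcript_digest(backend)[:16])
    print("GUI is aware of its content:", gui_is_aware(rendered, backend))

The point of the sketch is that the test criterion is derived from backend traffic rather than from a prior GUI implementation or golden images.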
Related papers
- ShowUI: One Vision-Language-Action Model for GUI Visual Agent [80.50062396585004]
Building Graphical User Interface (GUI) assistants holds significant promise for enhancing human workflow productivity.
We develop a vision-language-action model for the digital world, namely ShowUI, which features several innovations.
ShowUI, a lightweight 2B model using 256K data, achieves a strong 75.1% accuracy in zero-shot screenshot grounding.
arXiv Detail & Related papers (2024-11-26T14:29:47Z)
- Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping [55.98643055756135]
We introduce Sketch2Code, a benchmark that evaluates state-of-the-art Vision Language Models (VLMs) on automating the conversion of rudimentary sketches into webpage prototypes.
We analyze ten commercial and open-source models, showing that Sketch2Code is challenging for existing VLMs.
A user study with UI/UX experts reveals a significant preference for proactive question-asking over passive feedback reception.
arXiv Detail & Related papers (2024-10-21T17:39:49Z)
- Self-Elicitation of Requirements with Automated GUI Prototyping [12.281152349482024]
SERGUI is a novel approach enabling the Self-Elicitation of Requirements based on an automated GUI prototyping assistant.
SERGUI exploits the vast prototyping knowledge embodied in a large-scale GUI repository through Natural Language Requirements (NLR)-based GUI retrieval.
To measure the effectiveness of our approach, we conducted a preliminary evaluation.
arXiv Detail & Related papers (2024-09-24T18:40:38Z)
- UNIT: Unifying Image and Text Recognition in One Vision Encoder [51.140564856352825]
UNIT is a novel training framework aimed at UNifying Image and Text recognition within a single model.
We show that UNIT significantly outperforms existing methods on document-related tasks.
Notably, UNIT retains the original vision encoder architecture, making it cost-free in terms of inference and deployment.
arXiv Detail & Related papers (2024-09-06T08:02:43Z)
- GUICourse: From General Vision Language Models to Versatile GUI Agents [75.5150601913659]
We contribute GUICourse, a suite of datasets to train visual-based GUI agents.
First, we introduce the GUIEnv dataset to strengthen the OCR and grounding capabilities of VLMs.
Then, we introduce the GUIAct and GUIChat datasets to enrich their knowledge of GUI components and interactions.
arXiv Detail & Related papers (2024-06-17T08:30:55Z)
- Interlinking User Stories and GUI Prototyping: A Semi-Automatic LLM-based Approach [55.762798168494726]
We present a novel Large Language Model (LLM)-based approach for validating the implementation of functional NL-based requirements in a graphical user interface (GUI) prototype.
Our approach aims to detect functional user stories that are not implemented in a GUI prototype and provides recommendations for suitable GUI components directly implementing the requirements.
arXiv Detail & Related papers (2024-06-12T11:59:26Z)
- Gamified GUI testing with Selenium in the IntelliJ IDE: A Prototype Plugin [0.559239450391449]
This paper presents GIPGUT: a prototype of a gamification plugin for IntelliJ IDEA.
The plugin enhances testers' engagement with typically monotonous and tedious tasks through achievements, rewards, and profile customization.
The results indicate high usability and positive reception of the gamification elements.
arXiv Detail & Related papers (2024-03-14T20:11:11Z)
- CoCo-Agent: A Comprehensive Cognitive MLLM Agent for Smartphone GUI Automation [61.68049335444254]
Multimodal large language models (MLLMs) have shown remarkable potential as human-like autonomous language agents to interact with real-world environments.
We propose a Comprehensive Cognitive LLM Agent, CoCo-Agent, with two novel approaches: comprehensive environment perception (CEP) and conditional action prediction (CAP).
With our technical design, our agent achieves new state-of-the-art performance on AITW and META-GUI benchmarks, showing promising abilities in realistic scenarios.
arXiv Detail & Related papers (2024-02-19T08:29:03Z)
- Vision-Based Mobile App GUI Testing: A Survey [29.042723121518765]
Vision-based mobile app GUI testing approaches emerged with the development of computer vision technologies.
We provide a comprehensive investigation of the state-of-the-art techniques on 271 papers, among which 92 are vision-based studies.
arXiv Detail & Related papers (2023-10-20T14:04:04Z)
- Effective, Platform-Independent GUI Testing via Image Embedding and Reinforcement Learning [15.458315113767686]
We propose PIRLTest, an effective platform-independent approach for app testing.
It utilizes computer vision and reinforcement learning techniques in a novel, synergistic manner for automated testing.
PIRLTest explores apps with the guidance of a curiosity-driven strategy, which uses a Q-network to estimate the values of specific state-action pairs (an illustrative sketch of this exploration idea appears after this list).
arXiv Detail & Related papers (2022-08-19T01:51:16Z)
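As an aside on the exploration mechanism mentioned for PIRLTest above, the following is a minimal, hypothetical Python sketch of curiosity-driven Q-learning over abstract GUI states. It illustrates the general idea only, not PIRLTest's implementation: the ToyGUI environment, screen names, and hyperparameters are all invented for the example, and PIRLTest itself works from image embeddings with a neural Q-network rather than a table.

import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate

class ToyGUI:
    """Invented stand-in for an app under test: screens connected by button taps."""
    GRAPH = {"home": ["settings", "search"],
             "settings": ["home", "about"],
             "search": ["home", "results"],
             "about": ["home"],
             "results": ["home", "search"]}

    def reset(self):
        return "home"

    def actions(self, state):
        return self.GRAPH[state]

    def step(self, state, action):
        return action  # tapping a button navigates to the named screen

def curiosity_reward(state, visit_counts):
    """Novelty as reward: rarely visited states are worth more."""
    return 1.0 / (1 + visit_counts[state])

def explore(env, episodes=50, steps=20):
    q = defaultdict(float)   # Q[(state, action)] -> estimated value
    visits = defaultdict(int)
    for _ in range(episodes):
        state = env.reset()
        for _ in range(steps):
            actions = env.actions(state)
            if random.random() < EPSILON:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            nxt = env.step(state, action)
            visits[nxt] += 1
            reward = curiosity_reward(nxt, visits)
            best_next = max(q[(nxt, a)] for a in env.actions(nxt))
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = nxt
    return q

if __name__ == "__main__":
    q_table = explore(ToyGUI())
    print(sorted(q_table.items(), key=lambda kv: -kv[1])[:5])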
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.