Practical Non-Intrusive GUI Exploration Testing with Visual-based
Robotic Arms
- URL: http://arxiv.org/abs/2312.10655v1
- Date: Sun, 17 Dec 2023 09:05:39 GMT
- Title: Practical Non-Intrusive GUI Exploration Testing with Visual-based
Robotic Arms
- Authors: Shengcheng Yu, Chunrong Fang, Mingzhe Du, Yuchen Ling, Zhenyu Chen,
Zhendong Su
- Abstract summary: We propose a practical non-intrusive GUI testing framework with visual robotic arms.
RoboTest integrates novel GUI screen and widget detection algorithms that adapt to screens of different sizes.
We evaluate RoboTest with 20 mobile apps, with a case study on an embedded system.
- Score: 14.3266199543725
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: GUI testing plays a significant role in the software engineering community.
Most existing frameworks are intrusive and support only specific platforms. As
application scenarios diversify, many embedded systems and customized operating
systems on different devices cannot be served by existing intrusive GUI testing
frameworks. Some approaches adopt robotic arms to replace programmatic interface
invocation on the mobile apps under test and use computer vision techniques to
identify GUI elements. However, several challenges remain unsolved. First, existing
approaches assume fixed GUI screens, so they cannot adapt to diverse systems with
different screen conditions. Second, existing approaches use XY-plane robotic
arms, which cannot flexibly simulate testing operations. Third, existing
approaches focus only on crash bugs and ignore compatibility bugs. A more
practical approach is required for the non-intrusive scenario. We propose
RoboTest, a practical non-intrusive GUI testing framework with visual robotic arms.
RoboTest integrates novel GUI screen and widget detection algorithms that adapt
to screens of different sizes and then extract GUI widgets from the detected
screens. A set of testing operations is then applied with a 4-DOF robotic arm,
which effectively and flexibly simulates human testing operations. During app
exploration, RoboTest applies a Principle-of-Proximity-guided exploration strategy,
choosing widgets close to the previous targets to reduce robotic arm movement
overhead and improve exploration efficiency. Beyond crash bugs, RoboTest can also
detect compatibility bugs by comparing the GUIs produced by the same test
operations on different devices. We evaluate RoboTest on 20 mobile apps, with a
case study on an embedded system. The results show that RoboTest can effectively,
efficiently, and generally explore apps under test (AUTs) to find bugs and reduce
exploration time overhead.
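
The abstract notes that RoboTest's screen detection adapts to screens of different sizes but gives no algorithmic detail. As a rough, purely illustrative sketch (not the paper's algorithm), a vision-only pipeline could locate the device screen in the arm camera's frame as the largest four-corner contour and rectify it to a fixed working resolution; the OpenCV pipeline, helper names, and the 1080x1920 target below are assumptions.

```python
import cv2
import numpy as np

def order_corners(pts: np.ndarray) -> np.ndarray:
    """Order four corner points as top-left, top-right, bottom-right, bottom-left."""
    s = pts.sum(axis=1)
    d = np.diff(pts, axis=1).ravel()  # y - x for each point
    return np.float32([pts[np.argmin(s)], pts[np.argmin(d)],
                       pts[np.argmax(s)], pts[np.argmax(d)]])

def locate_screen(frame: np.ndarray, out_size=(1080, 1920)):
    """Find the device screen in a camera frame as the largest quadrilateral
    contour (regardless of its physical size) and warp it to a fixed-size
    GUI image. Returns None when no screen-like contour is found."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for c in sorted(contours, key=cv2.contourArea, reverse=True):
        approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
        if len(approx) == 4:  # first (largest) four-corner contour wins
            corners = order_corners(approx.reshape(4, 2).astype(np.float32))
            w, h = out_size
            dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
            matrix = cv2.getPerspectiveTransform(corners, dst)
            return cv2.warpPerspective(frame, matrix, (w, h))
    return None
```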
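
The Principle-of-Proximity-guided exploration strategy is described only at a high level in the abstract. Below is a minimal sketch of the idea, assuming the widget detector returns bounding boxes in screen coordinates; the Widget alias, pick_next_widget, and the sample numbers are illustrative, not RoboTest's implementation.

```python
import math
from typing import List, Tuple

Widget = Tuple[float, float, float, float]  # (x, y, width, height) on screen

def center(widget: Widget) -> Tuple[float, float]:
    """Return the centre point of a widget's bounding box."""
    x, y, w, h = widget
    return (x + w / 2.0, y + h / 2.0)

def pick_next_widget(previous_target: Widget,
                     candidates: List[Widget]) -> Widget:
    """Choose the unexplored widget closest to the previous target, so the
    arm's end effector travels the shortest distance for the next operation."""
    prev_center = center(previous_target)
    return min(candidates, key=lambda w: math.dist(prev_center, center(w)))

# Example: the last tap was near the top-left corner, so the nearby widget
# is chosen before the distant one at the bottom of the screen.
previous = (40, 60, 120, 48)
unexplored = [(50, 130, 80, 80), (600, 900, 200, 60)]
print(pick_next_widget(previous, unexplored))  # -> (50, 130, 80, 80)
```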
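
The abstract says compatibility bugs are found by comparing the GUIs produced by the same test operations on different devices, without detailing the comparison. One plausible reading, sketched here under the assumption that widget boxes from both devices are normalised to their own screen size, is to flag widgets that have no positional counterpart on the other device; flag_compatibility_issue and the IoU threshold are assumptions, not the paper's method.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # normalised (x, y, w, h) in [0, 1]

def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two normalised boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def flag_compatibility_issue(widgets_a: List[Box],
                             widgets_b: List[Box],
                             iou_threshold: float = 0.5) -> bool:
    """After replaying the same operation on two devices, report a potential
    compatibility bug when some widget on device A has no counterpart at a
    similar normalised position on device B."""
    return any(all(iou(w, v) < iou_threshold for v in widgets_b)
               for w in widgets_a)
```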
Related papers
- GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration [56.58744345634623]
We propose GUI-Bee, an MLLM-based autonomous agent, to collect high-quality, environment-specific data through exploration.
We also introduce NovelScreenSpot, a benchmark for testing how well the data can help align GUI action grounding models to novel environments.
arXiv Detail & Related papers (2025-01-23T18:16:21Z)
- UI-TARS: Pioneering Automated GUI Interaction with Native Agents [58.18100825673032]
This paper introduces UI-TARS, a native GUI agent model that perceives only screenshots as input and performs human-like interactions.
In the OSWorld benchmark, UI-TARS achieves scores of 24.6 with 50 steps and 22.7 with 15 steps, outperforming Claude (22.0 and 14.9, respectively).
arXiv Detail & Related papers (2025-01-21T17:48:10Z)
- GUI Testing Arena: A Unified Benchmark for Advancing Autonomous GUI Testing Agent [24.97846085313314]
We propose a formalized and comprehensive environment to evaluate the entire process of automated GUI Testing.
We divide the testing process into three key subtasks: test intention generation, test task execution, and GUI defect detection.
It evaluates the performance of different models using three data types: real mobile applications, mobile applications with artificially injected defects, and synthetic data.
arXiv Detail & Related papers (2024-12-24T13:41:47Z)
- Falcon-UI: Understanding GUI Before Following User Instructions [57.67308498231232]
We introduce an instruction-free GUI navigation dataset, termed Insight-UI dataset, to enhance model comprehension of GUI environments.
Insight-UI dataset is automatically generated from the Common Crawl corpus, simulating various platforms.
We develop the GUI agent model Falcon-UI, which is initially pretrained on Insight-UI dataset and subsequently fine-tuned on Android and Web GUI datasets.
arXiv Detail & Related papers (2024-12-12T15:29:36Z)
- Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction [69.57190742976091]
We introduce Aguvis, a unified vision-based framework for autonomous GUI agents.
Our approach leverages image-based observations and grounds natural-language instructions to visual elements.
To address the limitations of previous work, we integrate explicit planning and reasoning within the model.
arXiv Detail & Related papers (2024-12-05T18:58:26Z)
- Seeing is Believing: Vision-driven Non-crash Functional Bug Detection for Mobile Apps [26.96558418166514]
This paper proposes a novel vision-driven, multi-agent collaborative automated GUI testing approach for detecting non-crash functional bugs.
We evaluate Trident on 590 non-crash bugs and compare it with 12 baselines; it achieves a 14%-112% boost in average recall and a 108%-147% boost in average precision.
arXiv Detail & Related papers (2024-07-03T11:58:09Z)
- RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation [77.41969287400977]
This paper presents RobotScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for code generation for robot manipulation tasks specified in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z)
- Vision-Based Mobile App GUI Testing: A Survey [29.042723121518765]
Vision-based mobile app GUI testing approaches emerged with the development of computer vision technologies.
We provide a comprehensive investigation of the state-of-the-art techniques on 271 papers, among which 92 are vision-based studies.
arXiv Detail & Related papers (2023-10-20T14:04:04Z)
- NiCro: Purely Vision-based, Non-intrusive Cross-Device and Cross-Platform GUI Testing [19.462053492572142]
We propose a non-intrusive cross-device and cross-platform system NiCro.
NiCro uses a state-of-the-art GUI widget detector to detect widgets from GUI images and then analyses a comprehensive set of information to match the widgets across diverse devices.
At the system level, NiCro can interact with a virtual device farm and a robotic arm system to perform cross-device, cross-platform testing non-intrusively.
arXiv Detail & Related papers (2023-05-24T01:19:05Z)
- Effective, Platform-Independent GUI Testing via Image Embedding and Reinforcement Learning [15.458315113767686]
We propose PIRLTest, an effective platform-independent approach for app testing.
It utilizes computer vision and reinforcement learning techniques in a novel, synergistic manner for automated testing.
PIRLTest explores apps with the guidance of a curiosity-driven strategy, which uses a Q-network to estimate the values of specific state-action pairs.
arXiv Detail & Related papers (2022-08-19T01:51:16Z)
- Projection Mapping Implementation: Enabling Direct Externalization of Perception Results and Action Intent to Improve Robot Explainability [62.03014078810652]
Existing research on non-verbal cues, e.g., eye gaze or arm movement, may not accurately present a robot's internal states.
Projecting the states directly onto a robot's operating environment has the advantages of being direct, accurate, and more salient.
arXiv Detail & Related papers (2020-10-05T18:16:20Z)