AXNav: Replaying Accessibility Tests from Natural Language
- URL: http://arxiv.org/abs/2310.02424v3
- Date: Tue, 5 Mar 2024 01:28:25 GMT
- Title: AXNav: Replaying Accessibility Tests from Natural Language
- Authors: Maryam Taeb, Amanda Swearngin, Eldon Schoop, Ruijia Cheng, Yue Jiang,
Jeffrey Nichols
- Abstract summary: Large Language Models (LLMs) have been used for a variety of tasks including automation of UIs.
This paper explores the requirements of a natural language based accessibility testing workflow.
We build a system that takes as input a manual accessibility test (e.g., "Search for a show in VoiceOver") and uses an LLM combined with pixel-based UI Understanding models to execute the test.
- Score: 14.131076040673351
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Developers and quality assurance testers often rely on manual testing to test
accessibility features throughout the product lifecycle. Unfortunately, manual
testing can be tedious, often has an overwhelming scope, and can be difficult
to schedule amongst other development milestones. Recently, Large Language
Models (LLMs) have been used for a variety of tasks including automation of
UIs; however, to our knowledge, no one has yet explored their use in controlling
assistive technologies for the purposes of supporting accessibility testing. In
this paper, we explore the requirements of a natural language based
accessibility testing workflow, starting with a formative study. From this we
build a system that takes as input a manual accessibility test (e.g., "Search
for a show in VoiceOver") and uses an LLM combined with pixel-based UI
Understanding models to execute the test and produce a chaptered, navigable
video. In each video, to help QA testers, we apply heuristics to detect and flag
accessibility issues (e.g., text size not increasing with Large Text enabled,
VoiceOver navigation loops). We evaluate this system through a 10 participant
user study with accessibility QA professionals who indicated that the tool
would be very useful in their current work and performed tests similarly to how
they would manually test the features. The study also reveals insights for
future work on using LLMs for accessibility testing.
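The two flagged issue types named above (text not scaling with Large Text, and VoiceOver navigation loops) can be pictured as simple checks over the per-step data captured while a test is replayed. The following is a minimal sketch under that assumption, not the paper's implementation; every function name, parameter, and data shape here is hypothetical.

```python
# Hypothetical sketch (not the authors' code) of the two flagging heuristics
# named in the abstract. Assumes the replayed test records, per navigation
# step, the identifier of the VoiceOver-focused element, plus rendered text
# heights before and after enabling Large Text. All names are illustrative.

from typing import Dict, List, Sequence


def detect_voiceover_loop(focus_order: Sequence[str]) -> bool:
    """Flag a navigation loop: swiping through a screen should visit each
    element at most once, so revisiting an element suggests focus is cycling."""
    seen = set()
    for element_id in focus_order:
        if element_id in seen:
            return True
        seen.add(element_id)
    return False


def detect_static_text(sizes_default: Dict[str, float],
                       sizes_large_text: Dict[str, float],
                       tolerance: float = 0.5) -> List[str]:
    """Flag elements whose rendered text height did not grow after the
    Large Text setting was enabled."""
    return [
        element_id
        for element_id, before in sizes_default.items()
        if element_id in sizes_large_text
        and sizes_large_text[element_id] - before <= tolerance
    ]
```

Flags produced by checks like these could then be surfaced as chapters in the navigable video so testers can jump straight to the suspect steps.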
Related papers
- Multi-language Unit Test Generation using LLMs [6.259245181881262]
We describe a generic pipeline that incorporates static analysis to guide LLMs in generating compilable and high-coverage test cases.
We show how the pipeline can be applied to different programming languages, specifically Java and Python, and to complex software requiring environment mocking.
Our results demonstrate that LLM-based test generation, when guided by static analysis, can be competitive with, and even outperform, state-of-the-art test-generation techniques in coverage achieved.
arXiv Detail & Related papers (2024-09-04T21:46:18Z) - Learning to Ask: When LLMs Meet Unclear Instruction [49.256630152684764]
Large language models (LLMs) can leverage external tools for addressing a range of tasks unattainable through language skills alone.
We evaluate the performance of LLMs tool-use under imperfect instructions, analyze the error patterns, and build a challenging tool-use benchmark called Noisy ToolBench.
We propose a novel framework, Ask-when-Needed (AwN), which prompts LLMs to ask questions to users whenever they encounter obstacles due to unclear instructions.
arXiv Detail & Related papers (2024-08-31T23:06:12Z) - Protecting Copyrighted Material with Unique Identifiers in Large Language Model Training [55.321010757641524]
A major public concern regarding the training of large language models (LLMs) is whether they abuse copyrighted online text.
Previous membership inference methods may be misled by similar examples in vast amounts of training data.
We propose an alternative insert-and-detection methodology, advocating that web users and content platforms employ unique identifiers.
arXiv Detail & Related papers (2024-03-23T06:36:32Z) - Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation.
We present an extensive overview by categorizing these works in terms of various IE subtasks and techniques.
We empirically analyze the most advanced methods and discover the emerging trend of IE tasks with LLMs.
arXiv Detail & Related papers (2023-12-29T14:25:22Z) - Are We Testing or Being Tested? Exploring the Practical Applications of
Large Language Models in Software Testing [0.0]
A Large Language Model (LLM) represents a cutting-edge artificial intelligence model that generates coherent content.
LLM can play a pivotal role in software development, including software testing.
This study explores the practical application of LLMs in software testing within an industrial setting.
arXiv Detail & Related papers (2023-12-08T06:30:37Z) - Make LLM a Testing Expert: Bringing Human-like Interaction to Mobile GUI
Testing via Functionality-aware Decisions [23.460051600514806]
GPTDroid is a Q&A-based GUI testing framework for mobile apps.
We introduce a functionality-aware memory prompting mechanism.
It outperforms the best baseline by 32% in activity coverage, and detects 31% more bugs at a faster rate.
arXiv Detail & Related papers (2023-10-24T12:30:26Z) - LLM for Test Script Generation and Migration: Challenges, Capabilities,
and Opportunities [8.504639288314063]
Test script generation is a vital component of software testing, enabling efficient and reliable automation of repetitive test tasks.
Existing generation approaches often encounter limitations, such as difficulties in accurately capturing and reproducing test scripts across diverse devices, platforms, and applications.
This paper investigates the application of large language models (LLM) in the domain of mobile application test script generation.
arXiv Detail & Related papers (2023-09-24T07:58:57Z) - Towards Autonomous Testing Agents via Conversational Large Language
Models [18.302956037305112]
Large language models (LLMs) can be used as automated testing assistants.
We present a taxonomy of LLM-based testing agents based on their level of autonomy.
arXiv Detail & Related papers (2023-06-08T12:22:38Z) - BiasTestGPT: Using ChatGPT for Social Bias Testing of Language Models [73.29106813131818]
Bias testing is currently cumbersome since the test sentences are generated from a limited set of manual templates or need expensive crowd-sourcing.
We propose using ChatGPT for the controllable generation of test sentences, given any arbitrary user-specified combination of social groups and attributes.
We present an open-source comprehensive bias testing framework (BiasTestGPT), hosted on HuggingFace, that can be plugged into any open-source PLM for bias testing.
arXiv Detail & Related papers (2023-02-14T22:07:57Z) - Towards Informed Design and Validation Assistance in Computer Games
Using Imitation Learning [65.12226891589592]
This paper proposes a new approach to automated game validation and testing.
Our method leverages a data-driven imitation learning technique, which requires little effort and time and no knowledge of machine learning or programming.
arXiv Detail & Related papers (2022-08-15T11:08:44Z) - UKP-SQUARE: An Online Platform for Question Answering Research [50.35348764297317]
We present UKP-SQUARE, an online QA platform that allows researchers to query and analyze a large collection of modern Skills via a user-friendly web interface and integrated tests.
arXiv Detail & Related papers (2022-03-25T15:00:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.