Improving web element localization by using a large language model
- URL: http://arxiv.org/abs/2310.02046v1
- Date: Tue, 3 Oct 2023 13:39:22 GMT
- Title: Improving web element localization by using a large language model
- Authors: Michel Nass, Emil Alegroth, Robert Feldt
- Abstract summary: Large Language Models (LLMs) can show human-like reasoning abilities on some tasks.
This paper introduces and evaluates VON Similo LLM, an enhanced web element localization approach.
- Score: 6.126394204968227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Web-based test automation heavily relies on accurately finding web elements.
Traditional methods compare attributes but don't grasp the context and meaning
of elements and words. The emergence of Large Language Models (LLMs) like
GPT-4, which can show human-like reasoning abilities on some tasks, offers new
opportunities for software engineering and web element localization. This paper
introduces and evaluates VON Similo LLM, an enhanced web element localization
approach. Using an LLM, it selects the most likely web element from the
top-ranked candidates identified by the existing VON Similo method, aiming for
selection accuracy closer to that of a human. An experimental study was
conducted using 804 web element pairs from 48 real-world web applications. We
measured the number of correctly identified elements as well as the execution
times, comparing the effectiveness and efficiency of VON Similo LLM against the
baseline algorithm. In addition, motivations from the LLM were recorded and
analyzed for all instances where the original approach failed to find the right
web element. VON Similo LLM demonstrated improved performance, reducing failed
localizations from 70 to 39 (out of 804), a 44 percent reduction. Despite its
slower execution time and the additional cost of using the GPT-4 model, the LLM's
human-like reasoning showed promise in enhancing web element localization. LLM
technology can enhance web element identification in GUI test automation,
reducing false positives and potentially lowering maintenance costs. However,
further research is necessary to fully understand LLMs' capabilities,
limitations, and practical use in GUI testing.
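As the abstract describes it, the approach is a two-stage pipeline: the attribute-comparing VON Similo ranker shortlists candidate elements, and an LLM then picks the most likely match among them, with its motivation recorded. Below is a minimal sketch of that flow, assuming a generic chat-completion client; the Candidate fields, prompt wording, and helper names are illustrative and not taken from the paper's implementation.
```python
# Illustrative two-stage localization sketch (not the paper's actual code):
# stage 1 ranks candidate elements by attribute similarity (as VON Similo does),
# stage 2 asks an LLM to pick the most likely match among the top-ranked ones.
from dataclasses import dataclass


@dataclass
class Candidate:
    xpath: str
    tag: str
    text: str
    score: float  # attribute-similarity score from the stage-1 ranker


def top_candidates(candidates: list[Candidate], k: int = 10) -> list[Candidate]:
    """Stage 1 stand-in: keep the k highest-scoring candidates."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:k]


def build_prompt(target_attrs: dict, top: list[Candidate]) -> str:
    """Stage 2: ask which candidate matches the element we failed to locate."""
    lines = [
        "A web element with these recorded attributes could not be located:",
        str(target_attrs),
        "Which of the following candidates is most likely the same element?",
        "Answer with the candidate number, then a short motivation.",
    ]
    lines += [
        f"{i}. tag={c.tag} text={c.text!r} xpath={c.xpath}"
        for i, c in enumerate(top, 1)
    ]
    return "\n".join(lines)


def ask_llm(prompt: str) -> str:
    """Placeholder for a chat-completion call to a model such as GPT-4."""
    raise NotImplementedError("wire up an LLM client here")


def locate(target_attrs: dict, candidates: list[Candidate]) -> Candidate:
    top = top_candidates(candidates)
    answer = ask_llm(build_prompt(target_attrs, top))
    try:
        # Parse the leading candidate number; the rest is the motivation.
        return top[int(answer.split(".", 1)[0].strip()) - 1]
    except (ValueError, IndexError):
        return top[0]  # fall back to the stage-1 winner
```
One design note on the fallback: returning the stage-1 winner when the LLM's answer cannot be parsed keeps this sketch no worse than the baseline ranker on malformed responses.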
Related papers
- Learning to Contextualize Web Pages for Enhanced Decision Making by LLM Agents [89.98593996816186]
We introduce LCoW, a framework for Learning language models to Contextualize complex Web pages into a more comprehensible form.
LCoW decouples web page understanding from decision making by training a separate contextualization module.
We demonstrate that our contextualization module effectively integrates with LLM agents of various scales to significantly enhance their decision-making capabilities.
arXiv Detail & Related papers (2025-03-12T01:33:40Z)
- WEPO: Web Element Preference Optimization for LLM-based Web Navigation [3.9400326648635566]
This paper introduces a novel approach to web navigation tasks, called Web Element Preference Optimization (WEPO).
We utilize unsupervised preference learning by sampling distance-based non-salient web elements as negative samples, optimizing the maximum likelihood objective within Direct Preference Optimization (DPO).
The results show that our method achieved the state-of-the-art, with an improvement of 13.8% over WebAgent and 5.3% over the visual language model CogAgent baseline.
arXiv Detail & Related papers (2024-12-14T08:25:28Z)
- AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents [52.13695464678006]
This study enhances an LLM-based web agent by simply refining its observation and action space.
AgentOccam surpasses the previous state-of-the-art and concurrent work by 9.8 (+29.4%) and 5.9 (+15.8%) absolute points, respectively.
arXiv Detail & Related papers (2024-10-17T17:50:38Z)
- Enhancing Fault Localization Through Ordered Code Analysis with LLM Agents and Self-Reflection [8.22737389683156]
Large Language Models (LLMs) offer promising improvements in fault localization by enhancing code comprehension and reasoning.
We introduce LLM4FL, a novel LLM-agent-based fault localization approach that integrates spectrum-based fault localization (SBFL) rankings with a divide-and-conquer strategy.
Our results demonstrate that LLM4FL outperforms AutoFL by 19.27% in Top-1 accuracy and surpasses state-of-the-art supervised techniques such as DeepFL and Grace.
arXiv Detail & Related papers (2024-09-20T16:47:34Z)
- Show, Don't Tell: Aligning Language Models with Demonstrated Feedback [54.10302745921713]
Demonstration ITerated Task Optimization (DITTO) directly aligns language model outputs to a user's demonstrated behaviors.
We evaluate DITTO's ability to learn fine-grained style and task alignment across domains such as news articles, emails, and blog posts.
arXiv Detail & Related papers (2024-06-02T23:13:56Z)
- Leveraging Large Language Models for Automated Web-Form-Test Generation: An Empirical Study [7.857895177494495]
Large Language Models (LLMs) have great potential for contextual text generation.
OpenAI's GPT LLMs have been receiving a lot of attention in software testing.
This study investigates the effectiveness of 11 LLMs on 146 web forms from 30 open-source Java web applications.
arXiv Detail & Related papers (2024-05-16T10:21:03Z)
- ST-LLM: Large Language Models Are Effective Temporal Learners [58.79456373423189]
Large Language Models (LLMs) have showcased impressive capabilities in text comprehension and generation.
How to effectively encode and understand videos in video-based dialogue systems remains an open problem.
We propose ST-LLM, an effective video-LLM baseline with spatial-temporal sequence modeling inside LLM.
arXiv Detail & Related papers (2024-03-30T10:11:26Z)
- "What's important here?": Opportunities and Challenges of Using LLMs in Retrieving Information from Web Interfaces [19.656406003275713]
We study how large language models (LLMs) can be used to retrieve and locate important elements for a given user query in a web interface.
Our empirical experiments show that while LLMs exhibit a reasonable level of performance in retrieving important UI elements, there is still substantial room for improvement.
arXiv Detail & Related papers (2023-12-11T06:26:38Z)
- Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models [31.509994889286183]
We introduce Language Agent Tree Search (LATS) -- the first general framework that synergizes the capabilities of language models (LMs) in reasoning, acting, and planning.
A key feature of our approach is the incorporation of an environment for external feedback, which offers a more deliberate and adaptive problem-solving mechanism.
LATS achieves state-of-the-art pass@1 accuracy (92.7%) for programming on HumanEval with GPT-4 and demonstrates gradient-free performance (average score of 75.9) comparable to gradient-based fine-tuning for web navigation on WebShop with GPT
arXiv Detail & Related papers (2023-10-06T17:55:11Z)
- WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences [32.70333236055738]
WebGLM is a web-enhanced question-answering system based on the General Language Model (GLM).
We develop WebGLM with strategies for the LLM-augmented retriever, bootstrapped generator, and human preference-aware scorer.
arXiv Detail & Related papers (2023-06-13T16:57:53Z)
- The Web Can Be Your Oyster for Improving Large Language Models [98.72358969495835]
Large language models (LLMs) encode a large amount of world knowledge.
We consider augmenting LLMs with the large-scale web using a search engine.
We present UNIWEB, a web-augmented LLM trained over 16 knowledge-intensive tasks in a unified text-to-text format.
arXiv Detail & Related papers (2023-05-18T14:20:32Z)
- Can Large Language Models Transform Computational Social Science? [79.62471267510963]
Large Language Models (LLMs) are capable of performing many language processing tasks zero-shot (without training data).
This work provides a road map for using LLMs as Computational Social Science tools.
arXiv Detail & Related papers (2023-04-12T17:33:28Z)
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes LLM-Augmenter, a system that augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z)