Neural Embeddings for Web Testing
- URL: http://arxiv.org/abs/2306.07400v1
- Date: Mon, 12 Jun 2023 19:59:36 GMT
- Title: Neural Embeddings for Web Testing
- Authors: Andrea Stocco, Alexandra Willi, Luigi Libero Lucio Starace, Matteo Biagiola, Paolo Tonella
- Abstract summary: Existing crawlers rely on app-specific, threshold-based algorithms to assess state equivalence.
We propose WEBEMBED, a novel abstraction function based on neural network embeddings and threshold-free classifiers.
Our evaluation on nine web apps shows that WEBEMBED outperforms state-of-the-art techniques by detecting near-duplicates more accurately.
- Score: 49.66745368789056
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Web test automation techniques employ web crawlers to automatically produce a
web app model that is used for test generation. Existing crawlers rely on
app-specific, threshold-based algorithms to assess state equivalence. Such
algorithms are hard to tune in the general case and cannot accurately identify
and remove near-duplicate web pages from crawl models. Failing to retrieve an
accurate web app model results in automated test generation solutions that
produce redundant test cases and inadequate test suites that do not cover the
web app functionalities adequately. In this paper, we propose WEBEMBED, a novel
abstraction function based on neural network embeddings and threshold-free
classifiers that can be used to produce accurate web app models during
model-based test generation. Our evaluation on nine web apps shows that
WEBEMBED outperforms state-of-the-art techniques by detecting near-duplicates
more accurately, inferring better web app models that exhibit 22% more
precision, and 24% more recall on average. Consequently, the test suites
generated from these models achieve higher code coverage, with improvements
ranging from 2% to 59% per app and averaging 23%.
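The idea of an embedding-based abstraction function with a learned (rather than hand-tuned) decision boundary can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not WEBEMBED's implementation: `embed` is a toy hashed bag of HTML tags standing in for a neural embedding, the classifier is a one-feature logistic regression over cosine similarity, and the sample pages and the names `train` and `same_state` are hypothetical.

```python
import math
import re
from collections import Counter

DIM = 64  # size of the toy embedding vector (assumption)


def embed(html: str) -> list[float]:
    """Toy stand-in for a neural embedding: hashed bag of opening-tag names."""
    counts = Counter(re.findall(r"<\s*([a-zA-Z0-9]+)", html))
    vec = [0.0] * DIM
    for tag, n in counts.items():
        # Deterministic bucket: sum of character codes modulo DIM.
        vec[sum(map(ord, tag.lower())) % DIM] += n
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def train(pairs, labels, epochs=2000, lr=0.5):
    """Fit logistic regression on one feature (cosine similarity) with SGD.

    The decision boundary is learned from labeled page pairs instead of
    being set by hand as a similarity threshold.
    """
    feats = [cosine(embed(p), embed(q)) for p, q in pairs]
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(feats, labels):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b


def same_state(w: float, b: float, page_a: str, page_b: str) -> bool:
    """Abstraction function: True if the two pages map to the same state."""
    return w * cosine(embed(page_a), embed(page_b)) + b > 0


# Hypothetical pages: near-duplicates differ only in the number of items.
list_2 = "<ul><li>a</li><li>b</li></ul>"
list_3 = "<ul><li>a</li><li>b</li><li>c</li></ul>"
form_2 = "<form><input><input><button>ok</button></form>"
form_3 = "<form><input><input><input><button>ok</button></form>"
table_1 = "<table><tr><td>1</td></tr></table>"
table_2 = "<table><tr><td>1</td></tr><tr><td>2</td></tr></table>"

# Label 1 = near-duplicate pair, 0 = distinct states.
w, b = train(
    [(list_2, list_3), (list_2, form_2), (form_2, table_1), (form_2, form_3)],
    [1, 0, 0, 1],
)
print(same_state(w, b, table_1, table_2))  # unseen near-duplicate pair
print(same_state(w, b, table_1, form_2))   # unseen distinct pair
```

A crawler could call `same_state` on each newly reached page against the states already in its model and add a new state only when no existing one matches; swapping `embed` for a learned embedding and the classifier for a stronger one would not change that overall shape.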
Related papers
- LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content [62.816876067499415]
We propose LiveXiv: a scalable evolving live benchmark based on scientific ArXiv papers.
LiveXiv accesses domain-specific manuscripts at any given timestamp and proposes to automatically generate visual question-answer pairs.
We benchmark multiple open and proprietary Large Multi-modal Models (LMMs) on the first version of our benchmark, showing its challenging nature and exposing the models' true abilities.
arXiv Detail & Related papers (2024-10-14T17:51:23Z)
- WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents [1.9352015147920767]
We introduce Wilbur, an approach that uses a differentiable ranking model and a novel instruction synthesis technique.
We show that our ranking model can be trained on data from a generative auto-curriculum which samples representative goals.
Wilbur achieves state-of-the-art results on the WebVoyager benchmark, beating text-only models by 8% overall, and up to 36% on certain websites.
arXiv Detail & Related papers (2024-04-08T23:10:47Z)
- Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z)
- Post-training Model Quantization Using GANs for Synthetic Data Generation [57.40733249681334]
We investigate the use of synthetic data as a substitute for the calibration with real data for the quantization method.
We compare the performance of models quantized using data generated by StyleGAN2-ADA and our pre-trained DiStyleGAN, with quantization using real data and an alternative data generation method based on fractal images.
arXiv Detail & Related papers (2023-05-10T11:10:09Z)
- ESAI: Efficient Split Artificial Intelligence via Early Exiting Using Neural Architecture Search [6.316693022958222]
Deep neural networks have been outperforming conventional machine learning algorithms in many computer vision-related tasks.
The majority of devices are harnessing the cloud computing methodology in which outstanding deep learning models are responsible for analyzing the data on the server.
In this paper, a new framework for deploying on IoT devices has been proposed which can take advantage of both the cloud and the on-device models.
arXiv Detail & Related papers (2021-06-21T04:47:53Z)
- ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
- Active Testing: Sample-Efficient Model Evaluation [39.200332879659456]
We introduce active testing: a new framework for sample-efficient model evaluation.
Active testing addresses this by carefully selecting the test points to label.
We show how to remove that bias while reducing the variance of the estimator.
arXiv Detail & Related papers (2021-03-09T10:20:49Z)