Software Testing at the Network Layer: Automated HTTP API Quality Assessment and Security Analysis of Production Web Applications
- URL: http://arxiv.org/abs/2602.08242v4
- Date: Wed, 18 Feb 2026 04:00:37 GMT
- Title: Software Testing at the Network Layer: Automated HTTP API Quality Assessment and Security Analysis of Production Web Applications
- Authors: Ali Hassaan Mughal, Muhammad Bilal, Noor Fatima
- Abstract summary: We present an automated software testing framework that captures and analyzes the complete HTTP traffic of 18 production websites. Minimalist server-rendered sites achieve perfect scores of 100, while content-heavy commercial sites score as low as 56.8. We identify redundant API calls and missing cache headers as the two most pervasive anti-patterns, each affecting 67% of sites.
- Score: 1.9537983097153042
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern web applications rely heavily on client-side API calls to fetch data, render content, and communicate with backend services. However, the quality of these network interactions (redundant requests, missing cache headers, oversized payloads, and excessive third-party dependencies) is rarely tested in a systematic way. Moreover, many of these quality deficiencies carry security implications: missing cache headers enable cache poisoning, excessive third-party dependencies expand the supply-chain attack surface, and error responses risk leaking server internals. In this study, we present an automated software testing framework that captures and analyzes the complete HTTP traffic of 18 production websites spanning 11 categories (e-commerce, news, government, developer tools, travel, and more). Using automated browser instrumentation via Playwright, we record 108 HAR (HTTP Archive) files across 3 independent runs per page, then apply 8 heuristic-based anti-pattern detectors to produce a composite quality score (0-100) for each site. Our results reveal a wide quality spectrum: minimalist server-rendered sites achieve perfect scores of 100, while content-heavy commercial sites score as low as 56.8. We identify redundant API calls and missing cache headers as the two most pervasive anti-patterns, each affecting 67% of sites, while third-party overhead exceeds 20% on 72% of sites. One utility site makes 2,684 requests per page load, which is 447x more than the most minimal site. To protect site reputations, all identities are anonymized using category-based pseudonyms. We provide all analysis scripts, anonymized results, and reproducibility instructions as an open artifact. This work establishes an empirical baseline for HTTP API call quality across the modern web and offers a reproducible testing framework that researchers and practitioners can apply to their own applications.
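The abstract describes heuristic detectors applied to recorded HAR files. The sketch below illustrates three of the signals it names (redundant calls, missing cache headers, third-party share) on a HAR 1.2 dictionary. It is not the authors' implementation: the function name `analyze_har`, the set of headers treated as cache headers, the subdomain rule for first-party hosts, and the sample hosts are all assumptions made for illustration, and the paper's eight detectors and its scoring formula are not reproduced here.

```python
from collections import Counter
from urllib.parse import urlsplit

# Response headers treated as "cache headers" here (an illustrative choice,
# not the paper's definition).
CACHE_HEADERS = {"cache-control", "etag", "expires", "last-modified"}


def analyze_har(har, first_party_host):
    """Return simple per-page metrics from a HAR 1.2 dictionary (hypothetical helper)."""
    entries = har["log"]["entries"]

    # Redundant calls: the same (method, URL) pair requested more than once.
    seen = Counter((e["request"]["method"], e["request"]["url"]) for e in entries)
    redundant = sum(count - 1 for count in seen.values() if count > 1)

    # Missing cache headers: 200 responses to GET that carry no cache header at all.
    missing_cache = 0
    for e in entries:
        names = {h["name"].lower() for h in e["response"]["headers"]}
        is_plain_get = e["request"]["method"] == "GET" and e["response"]["status"] == 200
        if is_plain_get and not (names & CACHE_HEADERS):
            missing_cache += 1

    # Third-party share: hosts that are neither the first-party host nor a subdomain of it.
    def is_third_party(url):
        host = urlsplit(url).hostname or ""
        return not (host == first_party_host or host.endswith("." + first_party_host))

    third_party = sum(is_third_party(e["request"]["url"]) for e in entries)

    return {
        "requests": len(entries),
        "redundant_calls": redundant,
        "missing_cache_headers": missing_cache,
        "third_party_share": third_party / len(entries) if entries else 0.0,
    }


def _entry(method, url, status=200, headers=()):
    """Build a minimal HAR entry for the example (only the fields used above)."""
    return {
        "request": {"method": method, "url": url},
        "response": {
            "status": status,
            "headers": [{"name": n, "value": v} for n, v in headers],
        },
    }


# Made-up traffic for a hypothetical site "shop.example".
sample_har = {"log": {"entries": [
    _entry("GET", "https://shop.example/api/cart", headers=[("Cache-Control", "max-age=60")]),
    _entry("GET", "https://shop.example/api/cart", headers=[("Cache-Control", "max-age=60")]),
    _entry("GET", "https://cdn.shop.example/app.js"),
    _entry("GET", "https://tracker.example.net/pixel.gif", headers=[("ETag", '"abc"')]),
]}}

report = analyze_har(sample_har, "shop.example")
print(report)
```

In practice the input would come from a recorded file, for example one written by Playwright's `record_har_path` context option and loaded with `json.load`. The 20% third-party threshold mentioned in the abstract could then be checked as `report["third_party_share"] > 0.20`.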
Related papers
- FullStack-Agent: Enhancing Agentic Full-Stack Web Coding via Development-Oriented Testing and Repository Back-Translation [48.18394873529704]
FullStack-Agent is a unified agent system for full-stack agentic coding.
It consists of three parts: FullStack-Dev, a multi-agent framework with strong planning, code editing, navigation, and bug localization abilities.
Our FullStack-Dev outperforms the previous state-of-the-art method by 8.7%, 38.2%, and 15.9% on the frontend, backend, and database test cases respectively.
arXiv Detail & Related papers (2026-02-03T18:01:34Z)
- Characterizing Phishing Pages by JavaScript Capabilities [77.64740286751834]
This paper aims to aid researchers and analysts by automatically differentiating groups of phishing pages based on the underlying kit.
For kit detection, our system has an accuracy of 97% on a ground-truth dataset of 548 kit families deployed across 4,562 phishing URLs.
We find that UI interactivity and basic fingerprinting are universal techniques, present in 90% and 80% of the clusters.
arXiv Detail & Related papers (2025-09-16T15:39:23Z)
- Local Frames: Exploiting Inherited Origins to Bypass Content Blockers [9.01934402761379]
Local frames (i.e., iframes loading content like "about:blank") are mishandled by a wide range of popular Web security and privacy tools.
We consider four core capabilities supported by most privacy tools and develop tests to determine whether each can be evaded through the use of local frames.
We apply our tests to six popular Web privacy and security tools -- identifying at least one vulnerability in each for a total of 19 -- and extract common patterns regarding their mishandling of local frames.
arXiv Detail & Related papers (2025-05-31T00:07:24Z)
- ChatHTTPFuzz: Large Language Model-Assisted IoT HTTP Fuzzing [18.095573835226787]
Internet of Things (IoT) devices offer convenience through web interfaces, web VPNs, and other web-based services, all relying on the HTTP protocol.
Most state-of-the-art tools still rely on random mutation strategies, leading to difficulties in accurately understanding the HTTP protocol's structure and generating many invalid test cases.
We propose a novel LLM-guided IoT HTTP fuzzing method, ChatHTTPFuzz, which automatically parses protocol fields and analyzes service code logic to generate protocol-compliant test cases.
arXiv Detail & Related papers (2024-11-18T10:48:53Z)
- Beyond Browsing: API-Based Web Agents [58.39129004543844]
API-Based Agents outperform web Browsing Agents in experiments on WebArena.
Hybrid Agents outperform both others nearly uniformly across tasks.
Results strongly suggest that when APIs are available, they present an attractive alternative to relying on web browsing alone.
arXiv Detail & Related papers (2024-10-21T19:46:06Z)
- Securing the Web: Analysis of HTTP Security Headers in Popular Global Websites [2.7039386580759666]
Over half of the websites examined (55.66%) received a dismal security grade of 'F'.
These low scores expose multiple issues such as weak implementation of Content Security Policies (CSP), neglect of HSTS guidelines, and insufficient application of Subresource Integrity (SRI).
arXiv Detail & Related papers (2024-10-19T01:03:59Z)
- Fuzzing Frameworks for Server-side Web Applications: A Survey [3.522950356329991]
This study reviews the state-of-the-art fuzzing frameworks for testing web applications through web API.
We collect papers from seven online repositories of peer-reviewed articles over the last ten years.
arXiv Detail & Related papers (2024-06-05T12:45:02Z)
- FV8: A Forced Execution JavaScript Engine for Detecting Evasive Techniques [53.288368877654705]
FV8 is a modified V8 JavaScript engine designed to identify evasion techniques in JavaScript code.
It selectively enforces code execution on APIs that conditionally inject dynamic code.
It identifies 1,443 npm packages and 164 (82%) extensions containing at least one type of evasion.
arXiv Detail & Related papers (2024-05-21T19:54:19Z)
- AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation [54.17246674188208]
Web scraping is a powerful technique that extracts data from websites, enabling automated data collection, enhancing data analysis capabilities, and minimizing manual data entry efforts.
Existing wrapper-based methods suffer from limited adaptability and scalability when faced with a new website.
We introduce the paradigm of generating web scrapers with large language models (LLMs) and propose AutoScraper, a two-stage framework that can handle diverse and changing web environments more efficiently.
arXiv Detail & Related papers (2024-04-19T09:59:44Z)
- Neural Embeddings for Web Testing [49.66745368789056]
Existing crawlers rely on app-specific, threshold-based algorithms to assess state equivalence.
We propose WEBEMBED, a novel abstraction function based on neural network embeddings and threshold-free classifiers.
Our evaluation on nine web apps shows that WEBEMBED outperforms state-of-the-art techniques by detecting near-duplicates more accurately.
arXiv Detail & Related papers (2023-06-12T19:59:36Z)
- EDEFuzz: A Web API Fuzzer for Excessive Data Exposures [3.5061201620029885]
Excessive Data Exposure (EDE) was the third most significant API vulnerability of 2019.
There are few automated tools -- either in research or industry -- to effectively find and remediate such issues.
We build the first fuzzing tool -- that we call EDEFuzz -- to systematically detect EDEs.
arXiv Detail & Related papers (2023-01-23T04:05:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.