Search-Based Fuzzing For RESTful APIs That Use MongoDB
- URL: http://arxiv.org/abs/2507.20848v1
- Date: Mon, 28 Jul 2025 13:59:39 GMT
- Title: Search-Based Fuzzing For RESTful APIs That Use MongoDB
- Authors: Hernan Ghianni, Man Zhang, Juan P. Galeotti, Andrea Arcuri
- Abstract summary: In APIs, interactions with a database are a common and crucial aspect. It is essential to consider the database's state (i.e., the data contained in the database) to achieve higher code coverage. This article presents novel techniques to enhance search-based software test generation for APIs interacting with databases.
- Score: 2.2209891813085396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In RESTful APIs, interactions with a database are a common and crucial aspect. When generating whitebox tests, it is essential to consider the database's state (i.e., the data contained in the database) to achieve higher code coverage and uncover more hidden faults. This article presents novel techniques to enhance search-based software test generation for RESTful APIs interacting with NoSQL databases. Specifically, we target the popular MongoDB database, by dynamically analyzing (via automated code instrumentation) the state of the database during the test generation process. Additionally, to achieve better results, our novel approach allows inserting NoSQL data directly from test cases. This is particularly beneficial when generating the correct sequence of events to set the NoSQL database in an appropriate state is challenging or time-consuming. This method is also advantageous for testing read-only microservices. Our novel techniques are implemented as an extension of EvoMaster, the only open-source tool for white-box fuzzing RESTful APIs. Experiments conducted on six RESTful APIs demonstrated significant improvements in code coverage, with increases of up to 18% compared to existing white-box approaches. To better highlight the improvements of our novel techniques, comparisons are also carried out with four state-of-the-art black-box fuzzers.
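To make the idea of seeding NoSQL state directly from a test case more concrete, the following is a minimal sketch of the kind of test such an extension could produce. It is illustrative only: the connection URI, the catalog/products database and collection, the document fields, and the /api/products/{id} endpoint are all hypothetical, and the snippet uses the plain MongoDB Java driver together with RestAssured rather than EvoMaster's actual generated-test format.

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

import static io.restassured.RestAssured.given;

public class SeededMongoTestSketch {

    public static void main(String[] args) {
        // Connection URI, database and collection names are illustrative.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> products =
                    client.getDatabase("catalog").getCollection("products");

            // Seed the NoSQL state directly from the test, instead of searching
            // for the sequence of HTTP calls that would create this document.
            products.insertOne(new Document("_id", 42)
                    .append("name", "widget")
                    .append("price", 9.99));

            // Exercise a (possibly read-only) endpoint against the seeded state;
            // branches that require a matching document become reachable.
            given().accept("application/json")
                   .when()
                   .get("http://localhost:8080/api/products/42")
                   .then()
                   .statusCode(200);
        }
    }
}
```

Seeding the collection up front makes code paths that depend on existing documents reachable without first discovering the HTTP call sequence that would create them, which is the scenario the abstract highlights for read-only microservices.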
Related papers
- Bridging the Gap: Enabling Natural Language Queries for NoSQL Databases through Text-to-NoSQL Translation [25.638927795540454]
We introduce the Text-to-NoSQL task, which aims to convert natural language queries into NoSQL queries. To promote research in this area, we released a large-scale and open-source dataset for this task, named TEND (short for Text-to-NoSQL dataset). We also designed an SLM (Small Language Model)-assisted and RAG (Retrieval-augmented Generation)-assisted multi-step framework called SMART, which is specifically designed for Text-to-NoSQL conversion.
arXiv Detail & Related papers (2025-02-16T17:01:48Z) - Utilizing API Response for Test Refinement [2.8002188463519944]
This paper proposes a dynamic test refinement approach that leverages the response message. Using an intelligent agent, the approach adds constraints to the API specification that are further used to generate a test scenario. The proposed approach led to a decrease in the number of 4xx responses, taking a step closer to generating more realistic test cases.
arXiv Detail & Related papers (2025-01-30T05:26:32Z) - LlamaRestTest: Effective REST API Testing with Small Language Models [50.058600784556816]
We present LlamaRestTest, a novel approach that employs two custom Large Language Models (LLMs) to generate realistic test inputs. We evaluate it against several state-of-the-art REST API testing tools, including RESTGPT, a GPT-powered specification-enhancement tool. Our study shows that small language models can perform as well as, or better than, large language models in REST API testing.
arXiv Detail & Related papers (2025-01-15T05:51:20Z) - A Multi-Agent Approach for REST API Testing with Semantic Graphs and LLM-Driven Inputs [46.65963514391019]
We present AutoRestTest, the first black-box tool to adopt a dependency-embedded multi-agent approach for REST API testing. Our approach treats REST API testing as a separable problem, where four agents collaborate to optimize API exploration. Our evaluation of AutoRestTest on 12 real-world REST services shows that it outperforms the four leading black-box REST API testing tools.
arXiv Detail & Related papers (2024-11-11T16:20:27Z) - DAC: Decomposed Automation Correction for Text-to-SQL [51.48239006107272]
We introduce Decomposed Automation Correction (DAC), which corrects text-to-SQL by decomposing entity linking and skeleton parsing.
We show that our method improves performance by 3.7% on average across Spider, Bird, and KaggleDBQA compared with the baseline method.
arXiv Detail & Related papers (2024-08-16T14:43:15Z) - Leveraging Large Language Models to Improve REST API Testing [51.284096009803406]
RESTGPT takes as input an API specification, extracts machine-interpretable rules, and generates example parameter values from natural-language descriptions in the specification.
Our evaluations indicate that RESTGPT outperforms existing techniques in both rule extraction and value generation.
arXiv Detail & Related papers (2023-12-01T19:53:23Z) - Advanced White-Box Heuristics for Search-Based Fuzzing of REST APIs [3.3714461095047743]
Currently, EvoMaster is the only existing tool that supports white-box fuzzing of REST APIs.
We provide a series of novel white-box heuristics, including, for example, how to deal with under-specified constraints in API schemas.
Our novel techniques are implemented as an extension to our open-source, search-based fuzzer EvoMaster.
arXiv Detail & Related papers (2023-09-15T12:39:01Z) - UNITE: A Unified Benchmark for Text-to-SQL Evaluation [72.72040379293718]
We introduce a UNIfied benchmark for Text-to-SQL Evaluation.
It is composed of publicly available text-to-SQL datasets and 29K databases.
Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
arXiv Detail & Related papers (2023-05-25T17:19:52Z) - Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs [89.68522473384522]
We present Bird, a big benchmark for large-scale database grounded in text-to-SQL tasks.
Our emphasis on database values highlights the new challenges of dirty database contents.
Even the most effective text-to-SQL models, i.e., ChatGPT, achieve only 40.08% in execution accuracy.
arXiv Detail & Related papers (2023-05-04T19:02:29Z) - KaggleDBQA: Realistic Evaluation of Text-to-SQL Parsers [26.15889661083109]
We present KaggleDBQA, a new cross-domain evaluation dataset of real Web databases.
We show that KaggleDBQA presents a challenge to state-of-the-art zero-shot parsers, but that a more realistic evaluation setting and creative use of associated database documentation boost their accuracy by over 13.2%.
arXiv Detail & Related papers (2021-06-22T00:08:03Z) - Towards a General Framework for ML-based Self-tuning Databases [3.3437858804655383]
State-of-the-art approaches include Bayesian optimization (BO) and reinforcement learning (RL).
We describe our experience when applying these methods to a database not yet studied in this context: FoundationDB.
We show that while BO and RL methods can improve the throughput of FoundationDB by up to 38%, random search is a highly competitive baseline.
arXiv Detail & Related papers (2020-11-16T13:13:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.