Related papers: Testing Database Systems with Large Language Model Synthesized Fragments

URL: http://arxiv.org/abs/2505.02012v1
Date: Sun, 04 May 2025 06:48:01 GMT
Title: Testing Database Systems with Large Language Model Synthesized Fragments
Authors: Suyang Zhong, Manuel Rigger
Abstract summary: We propose ShQveL, an approach that augments existing SQL test-case generators by leveraging Large Language Models (LLMs). We evaluated ShQveL on 5 DBMSs and discovered 55 unique and previously unknown bugs, 50 of which were promptly fixed after our reports.
Score: 3.3302293148249125
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Various automated testing approaches have been proposed for Database Management Systems (DBMSs). Many such approaches generate pairs of equivalent queries to identify bugs that cause DBMSs to compute incorrect results, and have found hundreds of bugs in mature, widely used DBMSs. Most of these approaches are based on manually written SQL generators; however, their bug-finding capabilities remain constrained by the limited set of SQL features supported by the generators. In this work, we propose ShQveL, an approach that augments existing SQL test-case generators by leveraging Large Language Models (LLMs) to synthesize SQL fragments. Our key idea is to systematically incorporate SQL features gained through automated interactions with LLMs into the SQL generators, increasing the features covered while efficiently generating test cases. Specifically, ShQveL uses SQL sketches -- SQL statements with incomplete code segments that LLMs fill -- to integrate LLM-generated content into the generator. We evaluated ShQveL on 5 DBMSs and discovered 55 unique and previously unknown bugs, 50 of which were promptly fixed after our reports.
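The sketch idea in the abstract can be illustrated with a minimal example. This is not ShQveL's implementation: the sketch template, the `{expr}` hole syntax, the hard-coded fragment list (standing in for LLM output), and all function names are assumptions made for illustration. The equivalence check splits a predicate into its TRUE, FALSE, and NULL cases and compares the counts against the unfiltered table. It is shown here as one possible equivalent-query oracle, not necessarily the one the paper uses.

```python
import sqlite3

# Hypothetical sketch: a SQL statement with one hole ({expr}) to be filled.
SKETCH = "SELECT COUNT(*) FROM t0 WHERE {expr}"

# Stand-in for LLM-synthesized fragments; a real system would query a model.
CANDIDATE_FRAGMENTS = ["abs(c0) > 1", "length(c1) = 2", "no_such_function(c0)"]


def fill(sketch, fragment):
    """Insert a fragment into the sketch's hole."""
    return sketch.format(expr=fragment)


def valid_fragments(conn, sketch, fragments):
    """Keep only the fragments whose filled sketch the DBMS accepts."""
    kept = []
    for fragment in fragments:
        try:
            conn.execute(fill(sketch, fragment)).fetchall()
            kept.append(fragment)
        except sqlite3.Error:
            pass  # unsupported or malformed fragment: discard it
    return kept


def partitions_agree(conn, fragment):
    """Rows where the predicate is TRUE, FALSE, or NULL must sum to all rows."""
    total = conn.execute("SELECT COUNT(*) FROM t0").fetchone()[0]
    parts = [f"({fragment})", f"NOT ({fragment})", f"({fragment}) IS NULL"]
    counted = sum(conn.execute(fill(SKETCH, p)).fetchone()[0] for p in parts)
    return total == counted


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t0 (c0 INTEGER, c1 TEXT)")
    conn.executemany(
        "INSERT INTO t0 VALUES (?, ?)", [(-3, "ab"), (1, "xyz"), (None, None)]
    )
    kept = valid_fragments(conn, SKETCH, CANDIDATE_FRAGMENTS)
    return kept, [partitions_agree(conn, f) for f in kept]
```

Calling `demo()` returns the fragments that SQLite accepted together with one boolean per fragment. A `False` entry would mean the partitioned counts disagree with the table size, which would point to a potential logic bug in the DBMS.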

Related papers

HI-SQL: Optimizing Text-to-SQL Systems through Dynamic Hint Integration [1.3927943269211591]
Text-to-SQL generation bridges the gap between natural language and databases, enabling users to query data without requiring SQL expertise. We propose HI-SQL, a pipeline that incorporates a novel hint generation mechanism utilizing historical query logs. By analyzing prior queries, our method generates contextual hints that focus on handling the complexities of multi-table and nested operations. Our approach significantly improves the accuracy of LLM-generated queries while ensuring efficiency in terms of calls and latency.
arXiv Detail & Related papers (2025-06-11T12:07:55Z)
Scaling Automated Database System Testing [3.3302293148249125]
In this work, we present both a vision and a platform, SQLancer++, to apply test oracles to any database that supports a subset of common SQL features.
arXiv Detail & Related papers (2025-03-27T12:10:36Z)
Evaluating and Enhancing LLMs for Multi-turn Text-to-SQL with Multiple Question Types [11.391598870596392]
Large language models (LLMs) have significantly advanced text-to-SQL systems. LLMs often narrowly focus on SQL generation, neglecting the complexities of real-world conversational queries. We propose MMSQL, a test suite designed to evaluate the question classification and SQL generation capabilities of LLMs.
arXiv Detail & Related papers (2024-12-21T10:13:45Z)
Exploring the Use of LLMs for SQL Equivalence Checking [15.42143912008553]
Equivalence checking of two SQL queries is an intractable problem. Existing methods can handle only a small subset of SQL, even for bounded equivalence checking. This paper explores whether large language models (LLMs) can also demonstrate the ability to reason with SQL queries.
arXiv Detail & Related papers (2024-12-07T06:50:12Z)
Towards Evaluating Large Language Models for Graph Query Generation [49.49881799107061]
Large Language Models (LLMs) are revolutionizing the landscape of Generative Artificial Intelligence (GenAI). This paper presents a comparative study addressing the challenge of generating queries in a powerful language for interacting with graph databases, using open-access LLMs. Our empirical analysis of query generation accuracy reveals that Claude Sonnet 3.5 outperforms its counterparts in this specific domain.
arXiv Detail & Related papers (2024-11-13T09:11:56Z)
PTD-SQL: Partitioning and Targeted Drilling with LLMs in Text-to-SQL [54.304872649870575]
Large Language Models (LLMs) have emerged as powerful tools for Text-to-SQL tasks. In this study, we propose that employing query group partitioning allows LLMs to focus on learning the thought processes specific to a single problem type.
arXiv Detail & Related papers (2024-09-21T09:33:14Z)
SQLfuse: Enhancing Text-to-SQL Performance through Comprehensive LLM Synergy [24.919119901664843]
This paper introduces SQLfuse, a robust system integrating open-source Large Language Models (LLMs) with a suite of tools to enhance query accuracy and usability. Its effectiveness is demonstrated by its leading performance on the Spider Leaderboard and its deployment by Ant Group.
arXiv Detail & Related papers (2024-07-19T06:01:57Z)
RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL [48.516004807486745]
Large language models (LLMs) with in-context learning have significantly improved the performance of the text-to-SQL task. We propose RB-SQL, a novel retrieval-based framework for in-context prompt engineering. Experimental results demonstrate that our model achieves better performance than several competitive baselines on the public datasets BIRD and Spider.
arXiv Detail & Related papers (2024-07-11T08:19:58Z)
MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL [47.120862170230566]
Recent Text-to-SQL methods usually suffer from significant performance degradation on "huge" databases. We introduce MAC-SQL, a novel LLM-based multi-agent collaborative framework for Text-to-SQL. In our framework, we leverage GPT-4 as the strong backbone for all agent tasks to determine the upper bound of our framework. We then fine-tune an open-source instruction-following model, SQL-Llama, based on Code Llama 7B, to accomplish all tasks as GPT-4 does.
arXiv Detail & Related papers (2023-12-18T14:40:20Z)
Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation [76.76046657162306]
Large language models (LLMs) have emerged as a new paradigm for the Text-to-SQL task.
arXiv Detail & Related papers (2023-08-29T14:59:54Z)
SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the SQL-PaLM framework for enhancing Text-to-SQL using large language models (LLMs). With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses. With instruction fine-tuning, we delve deeper into understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.