Finding Logic Bugs in Spatial Database Engines via Affine Equivalent Inputs
- URL: http://arxiv.org/abs/2410.12496v2
- Date: Thu, 17 Oct 2024 20:23:09 GMT
- Title: Finding Logic Bugs in Spatial Database Engines via Affine Equivalent Inputs
- Authors: Wenjing Deng, Qiuyang Mang, Chengyu Zhang, Manuel Rigger
- Abstract summary: Spatial Database Management Systems (SDBMSs) aim to store, manipulate, and retrieve spatial data.
The presence of logic bugs in SDBMSs can lead to incorrect results.
Detecting logic bugs in SDBMSs is challenging due to the lack of ground truth for identifying incorrect results.
- Score: 6.291508085458252
- Abstract: Spatial Database Management Systems (SDBMSs) aim to store, manipulate, and retrieve spatial data. SDBMSs are employed in various modern applications, such as geographic information systems, computer-aided design tools, and location-based services. However, the presence of logic bugs in SDBMSs can lead to incorrect results, substantially undermining the reliability of these applications. Detecting logic bugs in SDBMSs is challenging due to the lack of ground truth for identifying incorrect results. In this paper, we propose an automated geometry-aware generator to generate high-quality SQL statements for SDBMSs and a novel concept named Affine Equivalent Inputs (AEI) to validate the results of SDBMSs. We implemented them as a tool named Spatter (Spatial DBMSs Tester) for finding logic bugs in four popular SDBMSs: PostGIS, DuckDB Spatial, MySQL, and SQL Server. Our testing campaign detected 34 previously unknown and unique bugs in these SDBMSs, of which 30 have been confirmed and 18 have already been fixed. Our testing efforts have been well received by the developers. Experimental results demonstrate that the geometry-aware generator significantly outperforms a naive random-shape generator in detecting unique bugs, and that AEI can identify 14 logic bugs in SDBMSs that were overlooked by previous methodologies.
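To make the AEI idea concrete, here is a minimal sketch in Python using shapely as a stand-in for an SDBMS; the `check_aei` helper and the transform matrix are illustrative assumptions, not Spatter's actual implementation. Invertible affine maps preserve topological predicates such as intersection, so a disagreement between the original and transformed inputs signals a logic bug.

```python
# A minimal sketch of Affine Equivalent Inputs (AEI), using shapely as a
# stand-in for an SDBMS; check_aei and the matrix below are illustrative,
# not Spatter's actual code.
from shapely.geometry import Polygon
from shapely.affinity import affine_transform

def check_aei(g1, g2, matrix):
    """Invertible affine maps preserve topological predicates such as
    'intersects', so the original and transformed inputs must agree;
    a mismatch indicates a logic bug in the engine under test."""
    t1 = affine_transform(g1, matrix)
    t2 = affine_transform(g2, matrix)
    original = g1.intersects(g2)
    transformed = t1.intersects(t2)
    assert original == transformed, "logic bug: results diverge under affine map"
    return original

# 90-degree rotation plus a translation by (5, 7);
# shapely's 2D matrix layout is [a, b, d, e, xoff, yoff].
matrix = [0, -1, 1, 0, 5, 7]
a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])
print(check_aei(a, b, matrix))  # True on a correct engine
```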
Related papers
- Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation [73.9145653659403]
We show that Generative Error Correction models struggle to generalize beyond the specific types of errors encountered during training.
We propose DARAG, a novel approach designed to improve GEC for ASR in both in-domain (ID) and out-of-domain (OOD) scenarios.
Our approach is simple, scalable, and both domain- and language-agnostic.
arXiv Detail & Related papers (2024-10-17T04:00:29Z)
- BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data [61.936320820180875]
Large language models (LLMs) have become increasingly pivotal across various domains.
BabelBench is an innovative benchmark framework that evaluates the proficiency of LLMs in managing multimodal multistructured data with code execution.
Our experimental findings on BabelBench indicate that even cutting-edge models like ChatGPT 4 exhibit substantial room for improvement.
arXiv Detail & Related papers (2024-10-01T15:11:24Z)
- Tool-Assisted Agent on SQL Inspection and Refinement in Real-World Scenarios [28.55596803781757]
Database mismatches are more prevalent in real-world scenarios.
We introduce Spider-Mismatch, a new dataset constructed to reflect the condition mismatch problems encountered in real-world scenarios.
Our method achieves the highest performance on the averaged results of the Spider and Spider-Realistic datasets in few-shot settings.
arXiv Detail & Related papers (2024-08-30T03:38:37Z)
- SQLaser: Detecting DBMS Logic Bugs with Clause-Guided Fuzzing [17.421408394486072]
Database Management Systems (DBMSs) are vital components in modern data-driven systems.
Their complexity often leads to logic bugs, which can cause incorrect query results, data exposure, unauthorized access, etc.
Existing detection employs two strategies: rule-based bug detection and coverage-guided fuzzing.
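As a rough illustration of clause-guided generation (the clause pool and the bug-prone patterns below are invented, not drawn from SQLaser), a generator can bias random queries toward clause combinations that resemble historical bug-triggering queries:

```python
# A rough sketch of clause-guided query generation; the clause pool,
# table names, and bug-prone combinations are hypothetical, not SQLaser's.
import random

# Clause sequences drawn from (hypothetical) historical bug reports.
BUG_PRONE_PATTERNS = [
    ("WHERE", "GROUP BY", "HAVING"),
    ("JOIN", "WHERE", "ORDER BY"),
    ("WHERE", "LIMIT"),
]

OPTIONAL_CLAUSES = {
    "JOIN": "JOIN t2 ON t1.id = t2.id",
    "WHERE": "WHERE t1.c0 > 0",
    "GROUP BY": "GROUP BY t1.c0",
    "HAVING": "HAVING COUNT(*) > 1",
    "ORDER BY": "ORDER BY t1.c0",
    "LIMIT": "LIMIT 10",
}

def generate_query(guided=True):
    """With guidance, sample a clause pattern seen in past bugs;
    otherwise pick a random clause subset."""
    if guided:
        pattern = random.choice(BUG_PRONE_PATTERNS)
    else:
        pattern = [c for c in OPTIONAL_CLAUSES if random.random() < 0.5]
    clauses = " ".join(OPTIONAL_CLAUSES[c] for c in pattern)
    return f"SELECT t1.c0 FROM t1 {clauses}"

print(generate_query())
```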
arXiv Detail & Related papers (2024-07-05T06:56:33Z)
- DiscoveryBench: Towards Data-Driven Discovery with Large Language Models [50.36636396660163]
We present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery.
Our benchmark contains 264 tasks collected across 6 diverse domains, such as sociology and engineering.
Our benchmark, thus, illustrates the challenges in autonomous data-driven discovery and serves as a valuable resource for the community to make progress.
arXiv Detail & Related papers (2024-07-01T18:58:22Z)
- Testing Database Engines via Query Plan Guidance [6.789710498230718]
We propose the concept of Query Plan Guidance (QPG) for guiding automated testing towards "interesting" test cases.
We applied our method to three mature, widely used, and diverse database systems (SQLite, TiDB, and CockroachDB) and found 53 unique, previously unknown bugs.
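The QPG feedback loop can be sketched against SQLite in a few lines; the candidate pool and plan bookkeeping here are simplified assumptions rather than the paper's actual algorithm:

```python
# A simplified sketch of Query Plan Guidance (QPG) against SQLite; the
# candidate pool below is a placeholder for QPG's real mutation strategy.
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t(a INT, b INT);
    CREATE INDEX i_a ON t(a);
    INSERT INTO t VALUES (1, 2), (3, 4), (5, 6);
""")

def plan_of(query):
    """Serialize the query plan so distinct plans can be counted."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return tuple(row[-1] for row in rows)  # the 'detail' column

def mutate():
    # Placeholder mutation: vary predicates that may flip index usage.
    suffix = random.choice(["", " WHERE a = 3", " WHERE b = 4", " ORDER BY a"])
    return "SELECT * FROM t" + suffix

seen_plans, interesting = set(), []
for _ in range(100):
    candidate = mutate()
    plan = plan_of(candidate)
    if plan not in seen_plans:      # a new plan marks an "interesting" test
        seen_plans.add(plan)
        interesting.append(candidate)

print(len(interesting), "queries with distinct plans kept for the oracle")
```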
arXiv Detail & Related papers (2023-12-29T08:09:47Z)
- Detecting DBMS Bugs with Context-Sensitive Instantiation and Multi-Plan Execution [11.18715154222032]
This paper aims to solve two challenges: how to generate semantically correct SQL queries in a test case, and how to design effective oracles that capture logic bugs.
We have implemented a prototype system called Kangaroo and applied it to three widely used and well-tested DBMSs.
The comparison between our system and state-of-the-art systems shows that ours outperforms them in terms of the number of generated semantically valid queries, the code paths explored during testing, and the number of detected bugs.
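The multi-plan-execution oracle can be emulated minimally in SQLite by forcing two different plans for the same query, e.g., with and without an index (Kangaroo's actual plan-switching mechanism may differ):

```python
# A minimal emulation of a multi-plan-execution oracle in SQLite:
# the same query is run under two different plans (with and without
# an index); diverging results would indicate a logic bug.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t(a INT, b TEXT);
    INSERT INTO t VALUES (1, 'x'), (2, 'y'), (2, 'z'), (3, NULL);
""")

QUERY = "SELECT a FROM t WHERE a >= 2 ORDER BY a"

def run(query):
    return conn.execute(query).fetchall()

no_index = run(QUERY)                  # plan 1: full table scan
conn.execute("CREATE INDEX i_a ON t(a)")
with_index = run(QUERY)                # plan 2: index scan

# The oracle: result sets must match across plans.
assert no_index == with_index, f"logic bug: {no_index} != {with_index}"
print("plans agree:", no_index)
```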
arXiv Detail & Related papers (2023-12-08T10:15:56Z)
- A Comparative Study of Text Embedding Models for Semantic Text Similarity in Bug Reports [0.0]
Retrieving similar bug reports from an existing database can help reduce the time and effort required to resolve bugs.
We explored several embedding models such as TF-IDF (Baseline), FastText, Gensim, BERT, and ADA.
Our study provides insights into the effectiveness of different embedding methods for retrieving similar bug reports and highlights the impact of selecting the appropriate one for this task.
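As an illustration of the TF-IDF baseline for this retrieval task (the toy bug reports below are invented), similar reports can be ranked by cosine similarity:

```python
# A toy TF-IDF baseline for retrieving similar bug reports; the
# reports and query are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

reports = [
    "app crashes on startup with null pointer exception",
    "login button unresponsive after password reset",
    "crash at launch: NullPointerException in MainActivity",
]
query = ["application crashes immediately when launched"]

vectorizer = TfidfVectorizer()
report_vecs = vectorizer.fit_transform(reports)
query_vec = vectorizer.transform(query)

# Rank existing reports by cosine similarity to the new one.
scores = cosine_similarity(query_vec, report_vecs)[0]
best = scores.argmax()
print(f"most similar report: {reports[best]!r} (score {scores[best]:.2f})")
```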
arXiv Detail & Related papers (2023-08-17T21:36:56Z)
- Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
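The core loop can be sketched as follows, with `llm` a hypothetical stand-in for any code-generating model; the prompts and retry bound are illustrative, not the paper's exact setup:

```python
# A hedged sketch of a self-debugging loop; llm() is a hypothetical
# stand-in for any code-generation model, not a real API.
import traceback

def llm(prompt: str) -> str:
    """Hypothetical model call; replace with a real LLM client."""
    raise NotImplementedError

def self_debug(task: str, max_rounds: int = 3) -> str:
    prompt = f"Write a Python function for: {task}"
    for _ in range(max_rounds):
        code = llm(prompt)
        try:
            exec(code, {})          # run the candidate program
            return code             # executed cleanly: accept it
        except Exception:
            # Feed the error back so the model can repair its own code.
            prompt = (f"The following code failed:\n{code}\n"
                      f"Error:\n{traceback.format_exc()}\nFix it.")
    return code                     # last attempt if no round succeeded
```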
arXiv Detail & Related papers (2023-04-11T10:43:43Z)
- Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness [115.66421993459663]
Recent studies reveal that text-to-SQL models are vulnerable to task-specific perturbations.
We propose a comprehensive robustness benchmark based on Spider to diagnose model robustness.
We conduct a diagnostic study of state-of-the-art models on this benchmark.
arXiv Detail & Related papers (2023-01-21T03:57:18Z)
- Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, a major source of system information for troubleshooting.
arXiv Detail & Related papers (2021-02-23T09:17:05Z)