Speculative Ad-hoc Querying
- URL: http://arxiv.org/abs/2503.00714v1
- Date: Sun, 02 Mar 2025 03:44:31 GMT
- Title: Speculative Ad-hoc Querying
- Authors: Haoyu Li, Srikanth Kandula, Maria Angels de Luis Balaguer, Aditya Akella, Venkat Arun
- Abstract summary: SpeQL predicts likely queries based on the database schema, the user's past queries, and their incomplete query. It continuously displays results for speculated queries and subqueries in real time, aiding exploratory analysis. In the study, SpeQL improved users' query latency by up to $289\times$ while keeping the overhead reasonable, at $\$4$ per hour.
- Score: 12.427441557995484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Analyzing large datasets requires responsive query execution, but executing SQL queries on massive datasets can be slow. This paper explores whether query execution can begin even before the user has finished typing, allowing results to appear almost instantly. We propose SpeQL, a system that leverages Large Language Models (LLMs) to predict likely queries based on the database schema, the user's past queries, and their incomplete query. Since exact query prediction is infeasible, SpeQL speculates on partial queries in two ways: 1) it predicts the query structure to compile and plan queries in advance, and 2) it precomputes temporary tables that are much smaller than the original database but are still predicted to contain all information necessary to answer the user's final query. Additionally, SpeQL continuously displays results for speculated queries and subqueries in real time, aiding exploratory analysis. A utility/user study showed that SpeQL improved task completion time, and participants reported that its speculative display of results helped them discover patterns in the data more quickly. In the study, SpeQL improved users' query latency by up to $289\times$ while keeping the overhead reasonable, at $\$4$ per hour.
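The second speculation strategy in the abstract (precomputing a small temporary table that is predicted to contain everything the final query needs) can be illustrated with a minimal sketch. This is not SpeQL's implementation: the `orders` table, the `spec_orders` temporary table, the helper names, and the hard-coded "speculated" predicate `year = 2024` are all hypothetical, and the LLM prediction step is replaced by a fixed guess. It uses only Python's standard `sqlite3` module.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory table standing in for a large dataset (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER, region TEXT, amount REAL, year INTEGER)"
    )
    rows = [
        (1, "EU", 10.0, 2023),
        (2, "EU", 25.0, 2024),
        (3, "US", 40.0, 2024),
        (4, "US", 5.0, 2023),
        (5, "EU", 60.0, 2024),
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    return conn


def precompute_superset(conn: sqlite3.Connection) -> None:
    """Materialize a superset of the rows the final query is expected to need.

    Assumption: the user has typed "... FROM orders WHERE year = 2024 AND" so far,
    so we speculate that only 2024 rows will matter, whatever predicate follows.
    """
    conn.execute(
        "CREATE TEMP TABLE spec_orders AS SELECT * FROM orders WHERE year = 2024"
    )


def answer_final_query(conn: sqlite3.Connection, table: str) -> list:
    """Run the user's completed query against the given table.

    The table name is interpolated only because it is a trusted constant here.
    """
    sql = (
        f"SELECT region, SUM(amount) FROM {table} "
        "WHERE year = 2024 AND amount > 20 "
        "GROUP BY region ORDER BY region"
    )
    return conn.execute(sql).fetchall()


conn = build_demo_db()
precompute_superset(conn)  # done while the user is still typing

# Once the query is complete, the small temporary table answers it;
# the base table is queried here only to check that the results agree.
fast = answer_final_query(conn, "spec_orders")
slow = answer_final_query(conn, "orders")
print(fast == slow, fast)
```

The sketch only shows why the approach is sound when the speculation holds: every row the final query selects is already in the temporary table, so both paths return the same answer. If the user's final query turned out to need rows outside the speculated predicate, the temporary table could not be used and the query would have to run against the base table.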
Related papers
- UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
- Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu).
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
arXiv Detail & Related papers (2024-06-23T05:02:21Z)
- NL2KQL: From Natural Language to Kusto Query [1.7931930942711818]
NL2KQL is an innovative framework that uses large language models (LLMs) to convert natural language queries (NLQs) to Kusto Query Language (KQL) queries.
To validate NL2KQL's performance, we utilize an array of online (based on query execution) and offline (based on query parsing) metrics.
arXiv Detail & Related papers (2024-04-03T01:09:41Z)
- Searching for Better Database Queries in the Outputs of Semantic Parsers [16.221439565760058]
In this paper, we consider the case when, at test time, the system has access to an external criterion that evaluates the generated queries.
The criterion can vary from checking that a query executes without errors to verifying the query on a set of tests.
We apply our approach to state-of-the-art semantic parsers and report that it allows us to find many queries passing all the tests on different datasets.
arXiv Detail & Related papers (2022-10-13T17:20:45Z)
- Context-Aware Query Rewriting for Improving Users' Search Experience on E-commerce Websites [47.04727122209316]
E-commerce queries are often short and ambiguous.
Users tend to enter multiple searches, which we call context, before purchasing.
We propose an end-to-end context-aware query rewriting model.
arXiv Detail & Related papers (2022-09-15T19:46:01Z)
- Graph Enhanced BERT for Query Understanding [55.90334539898102]
Query understanding plays a key role in exploring users' search intents and helping users locate the information they most want.
In recent years, pre-trained language models (PLMs) have advanced various natural language processing tasks.
We propose a novel graph-enhanced pre-training framework, GE-BERT, which can leverage both query content and the query graph.
arXiv Detail & Related papers (2022-04-03T16:50:30Z)
- Learning Query Expansion over the Nearest Neighbor Graph [94.80212602202518]
Graph Query Expansion (GQE) is presented, which is learned in a supervised manner and performs aggregation over an extended neighborhood of the query.
The technique achieves state-of-the-art results over known benchmarks.
arXiv Detail & Related papers (2021-12-05T19:48:42Z)
- SPARQLing Database Queries from Intermediate Question Decompositions [7.475027071883912]
To translate natural language questions into database queries, most approaches rely on a fully annotated training set.
We reduce this burden using intermediate question representations that are grounded in databases.
Our pipeline consists of two parts: a semantic parser that converts natural language questions into the intermediate representations and a non-trainable transpiler to the SPARQL query language.
arXiv Detail & Related papers (2021-09-13T17:57:12Z)
- Learning GraphQL Query Costs (Extended Version) [7.899264246319001]
We propose a machine-learning approach to efficiently and accurately estimate the query cost.
Our framework is efficient and predicts query costs with high accuracy, consistently outperforming the static analysis by a large margin.
arXiv Detail & Related papers (2021-08-25T09:18:31Z)
- Approximating Aggregated SQL Queries With LSTM Networks [31.528524004435933]
We present a method for query approximation, also known as approximate query processing (AQP).
We use an LSTM network to learn the relationship between queries and their results, and to provide a rapid inference layer for predicting query results.
Our method was able to predict up to 120,000 queries per second, with a single-query latency of no more than 2 ms.
arXiv Detail & Related papers (2020-10-25T16:17:58Z)
- Query Understanding via Intent Description Generation [75.64800976586771]
We propose a novel Query-to-Intent-Description (Q2ID) task for query understanding.
Unlike existing ranking tasks which leverage the query and its description to compute the relevance of documents, Q2ID is a reverse task which aims to generate a natural language intent description.
We demonstrate the effectiveness of our model by comparing with several state-of-the-art generation models on the Q2ID task.
arXiv Detail & Related papers (2020-08-25T08:56:40Z)