Cortex AISQL: A Production SQL Engine for Unstructured Data
- URL: http://arxiv.org/abs/2511.07663v2
- Date: Wed, 19 Nov 2025 13:22:56 GMT
- Title: Cortex AISQL: A Production SQL Engine for Unstructured Data
- Authors: Paweł Liskowski, Benjamin Han, Paritosh Aggarwal, Bowei Chen, Boxin Jiang, Nitish Jindal, Zihan Li, Aaron Lin, Kyle Schmaus, Jay Tayade, Weicheng Zhao, Anupam Datta, Nathan Wiegand, Dimitris Tsirogiannis
- Abstract summary: AISQL is deployed in production at Snowflake, where it powers diverse customer workloads across analytics, search, and content understanding. First, AI-aware query optimization treats AI inference cost as a first-class optimization objective. Second, adaptive model cascades reduce inference costs by routing most rows through a fast proxy model. Third, semantic join query rewriting lowers the quadratic time complexity of join operations to linear.
- Score: 11.480345698642006
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Snowflake's Cortex AISQL is a production SQL engine that integrates native semantic operations directly into SQL. This integration allows users to write declarative queries that combine relational operations with semantic reasoning, enabling them to query both structured and unstructured data effortlessly. However, making semantic operations efficient at production scale poses fundamental challenges. Semantic operations are more expensive than traditional SQL operations, possess distinct latency and throughput characteristics, and their cost and selectivity are unknown during query compilation. Furthermore, existing query engines are not designed to optimize semantic operations. The AISQL query execution engine addresses these challenges through three novel techniques informed by production deployment data from Snowflake customers. First, AI-aware query optimization treats AI inference cost as a first-class optimization objective, reasoning about large language model (LLM) cost directly during query planning to achieve 2-8$\times$ speedups. Second, adaptive model cascades reduce inference costs by routing most rows through a fast proxy model while escalating uncertain cases to a powerful oracle model, achieving 2-6$\times$ speedups while maintaining 90-95% of oracle model quality. Third, semantic join query rewriting lowers the quadratic time complexity of join operations to linear through reformulation as multi-label classification tasks, achieving 15-70$\times$ speedups with often improved prediction quality. AISQL is deployed in production at Snowflake, where it powers diverse customer workloads across analytics, search, and content understanding.
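The model cascade described in the abstract can be illustrated with a minimal Python sketch. This is not the AISQL implementation: the function names, the toy proxy and oracle, and the fixed thresholds `low` and `high` are all hypothetical, and the paper describes an adaptive scheme whereas this sketch uses static thresholds for simplicity. It only shows the routing idea, where confident proxy scores are accepted directly and uncertain rows are escalated to the expensive oracle.

```python
from typing import Callable, List, Tuple


def cascade_filter(
    rows: List[str],
    proxy: Callable[[str], float],   # cheap model: returns a score in [0, 1]
    oracle: Callable[[str], bool],   # expensive model: returns the final decision
    low: float = 0.2,                # hypothetical threshold: at or below, reject
    high: float = 0.8,               # hypothetical threshold: at or above, accept
) -> Tuple[List[bool], int]:
    """Route each row through the proxy; escalate uncertain rows to the oracle.

    Returns the per-row decisions and the number of oracle calls made.
    """
    decisions: List[bool] = []
    oracle_calls = 0
    for row in rows:
        score = proxy(row)
        if score >= high:
            decisions.append(True)       # proxy is confident: accept
        elif score <= low:
            decisions.append(False)      # proxy is confident: reject
        else:
            oracle_calls += 1            # uncertain band: ask the oracle
            decisions.append(oracle(row))
    return decisions, oracle_calls


# Toy stand-ins for real models (assumptions for illustration only).
def toy_proxy(text: str) -> float:
    if "great" in text:
        return 0.9
    if "terrible" in text:
        return 0.1
    return 0.5


def toy_oracle(text: str) -> bool:
    return "okay" in text


rows = ["great product", "terrible service", "it was okay"]
decisions, calls = cascade_filter(rows, toy_proxy, toy_oracle)
print(decisions, calls)
```

In this toy run only the third row falls in the uncertain band, so the oracle is called once for three rows. Widening the band between `low` and `high` trades more oracle calls for decisions closer to the oracle's.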
Related papers
- Stacked from One: Multi-Scale Self-Injection for Context Window Extension [69.24689919827817]
The paper presents a novel framework based on multi-grained context compression and query-aware information acquisition. It achieves performance superior or comparable to strong baselines.
arXiv Detail & Related papers (2026-03-05T03:16:16Z)
- Text-to-SQL as Dual-State Reasoning: Integrating Adaptive Context and Progressive Generation [54.53145282349042]
We introduce DSR-SQL, a Dual-State Reasoning framework that models Text-to-SQL as an interaction between an adaptive context state and a progressive generation state. Without any post-training or in-context examples, DSR-SQL achieves competitive performance, reaching 35.28% execution accuracy on Spider 2.0-Snow and 68.32% on the BIRD development set.
arXiv Detail & Related papers (2025-11-26T13:52:50Z)
- SQLBarber: A System Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads [18.665946271507117]
SQLBarber is a system based on Large Language Models (LLMs) to generate customized and realistic SQL workloads. It reduces query generation time by one to three orders of magnitude, and significantly improves alignment with the target cost distribution. We construct and open-source ten benchmarks of varying difficulty levels and target query cost distributions based on real-world statistics from Snowflake and Amazon Redshift.
arXiv Detail & Related papers (2025-07-08T17:20:34Z)
- HI-SQL: Optimizing Text-to-SQL Systems through Dynamic Hint Integration [1.3927943269211591]
Text-to-SQL generation bridges the gap between natural language and databases, enabling users to query data without requiring SQL expertise. We propose HI-SQL, a pipeline that incorporates a novel hint generation mechanism utilizing historical query logs. By analyzing prior queries, our method generates contextual hints that focus on handling the complexities of multi-table and nested operations. Our approach significantly improves query accuracy of LLM-generated queries while ensuring efficiency in terms of calls and latency.
arXiv Detail & Related papers (2025-06-11T12:07:55Z)
- A Learned Cost Model-based Cross-engine Optimizer for SQL Workloads [3.7960472831772765]
Lakehouse systems enable the same data to be queried with multiple execution engines. We propose a cross-engine optimizer that can automate engine selection for diverse queries through a learned cost model. We show that using an optimized logical plan for cost estimation decreases the average Q-error by as much as 12.6% compared with using unoptimized plans as input.
arXiv Detail & Related papers (2025-06-03T12:32:56Z)
- Weaver: Interweaving SQL and LLM for Table Reasoning [62.55797244714265]
Weaver generates a flexible, step-by-step plan that combines SQL for structured data retrieval with LLMs for semantic processing. Weaver consistently outperforms state-of-the-art methods across four TableQA datasets.
arXiv Detail & Related papers (2025-05-25T03:27:37Z)
- Query and Conquer: Execution-Guided SQL Generation [2.07180164747172]
We propose a novel approach for generating complex outputs that significantly improves accuracy in text-to-SQL tasks. Our method leverages execution results to select the most semantically consistent query from multiple candidates.
arXiv Detail & Related papers (2025-03-31T17:43:36Z)
- UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
- JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning [58.71541261221863]
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost.
We present JoinGym, a query optimization environment for reinforcement learning (RL) that supports bushy join plans.
Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
arXiv Detail & Related papers (2023-07-21T17:00:06Z)
- Wav2SQL: Direct Generalizable Speech-To-SQL Parsing [55.10009651476589]
Speech-to-SQL (S2SQL) aims to convert spoken questions into SQL queries given databases.
We propose the first direct speech-to-SQL parsing model, Wav2SQL, which avoids error compounding across cascaded systems.
Experimental results demonstrate that Wav2SQL avoids error compounding and achieves state-of-the-art results, with up to 2.5% accuracy improvement over the baseline.
arXiv Detail & Related papers (2023-05-21T19:26:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.