Related papers: Node-Level Uncertainty Estimation in LLM-Generated SQL

URL: http://arxiv.org/abs/2511.13984v2
Date: Wed, 19 Nov 2025 20:18:33 GMT
Title: Node-Level Uncertainty Estimation in LLM-Generated SQL
Authors: Hilaf Hasson, Ruocheng Guo
Abstract summary: We introduce a semantically aware labeling algorithm that assigns node-level correctness without over-penalizing structural containers or alias variation. We represent each node with a rich set of schema-aware and lexical features - capturing identifier validity, alias resolution, type compatibility, ambiguity in scope, and typo signals. We interpret these probabilities as uncertainty, enabling fine-grained diagnostics that pinpoint exactly where a query is likely to be wrong.
Score: 13.436696325103147
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present a practical framework for detecting errors in LLM-generated SQL by estimating uncertainty at the level of individual nodes in the query's abstract syntax tree (AST). Our approach proceeds in two stages. First, we introduce a semantically aware labeling algorithm that, given a generated SQL query and a gold reference, assigns node-level correctness without over-penalizing structural containers or alias variation. Second, we represent each node with a rich set of schema-aware and lexical features - capturing identifier validity, alias resolution, type compatibility, ambiguity in scope, and typo signals - and train a supervised classifier to predict per-node error probabilities. We interpret these probabilities as calibrated uncertainty, enabling fine-grained diagnostics that pinpoint exactly where a query is likely to be wrong. Across multiple databases and datasets, our method substantially outperforms token log-probabilities: average AUC improves by +27.44% while maintaining robustness under cross-database evaluation. Beyond serving as an accuracy signal, node-level uncertainty supports targeted repair, human-in-the-loop review, and downstream selective execution. Together, these results establish node-centric, semantically grounded uncertainty estimation as a strong and interpretable alternative to aggregate sequence-level confidence measures.
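To make the second stage of the abstract concrete, here is a minimal sketch of per-node features feeding a supervised classifier. It is an illustration under stated assumptions, not the authors' implementation: the toy `SCHEMA`, the three features, the hand-made labels, and the helper names (`node_features`, `train_logreg`, `error_probability`) are all hypothetical, and a plain logistic regression stands in for whatever classifier the paper trains. In the paper the labels come from the labeling algorithm applied against a gold reference, which this sketch does not reproduce.

```python
import difflib
import math

# Hypothetical toy schema (table -> columns); not taken from the paper.
SCHEMA = {
    "orders": {"id", "customer_id", "total"},
    "customers": {"id", "name", "city"},
}


def node_features(identifier, tables_in_scope):
    """Return [invalid, ambiguous, typo_like] for one column-reference node."""
    owners = [t for t in tables_in_scope if identifier in SCHEMA[t]]
    invalid = 1.0 if not owners else 0.0
    ambiguous = 1.0 if len(owners) > 1 else 0.0
    columns = sorted({c for t in tables_in_scope for c in SCHEMA[t]})
    # An invalid name that nearly matches a real column looks like a typo.
    near = difflib.get_close_matches(identifier, columns, n=1, cutoff=0.8)
    typo_like = 1.0 if invalid and near else 0.0
    return [invalid, ambiguous, typo_like]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def error_probability(w, b, x):
    """Predicted probability that the node described by features x is wrong."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logreg(X, y, lr=0.5, epochs=200):
    """Plain SGD logistic regression; a stand-in for any supervised classifier."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            grad = error_probability(w, b, xi) - yi
            w = [wj - lr * grad * xj for wj, xj in zip(w, xi)]
            b -= lr * grad
    return w, b


# Hand-made labels (1 = node is wrong), invented for this illustration.
TRAIN = [
    ("total", ["orders"], 0),
    ("name", ["customers"], 0),
    ("totl", ["orders"], 1),             # misspelled column
    ("id", ["orders", "customers"], 1),  # ambiguous without a qualifier
    ("price", ["orders"], 1),            # column does not exist
    ("city", ["customers"], 0),
]
X = [node_features(name, scope) for name, scope, _ in TRAIN]
y = [label for _, _, label in TRAIN]
w, b = train_logreg(X, y)

for name, scope in [("totl", ["orders"]), ("total", ["orders"])]:
    p = error_probability(w, b, node_features(name, scope))
    print(f"{name}: P(error) = {p:.2f}")
```

The per-node probabilities printed at the end are the kind of signal the abstract proposes to read as uncertainty: a high value flags a specific identifier for repair or review rather than marking the whole query as suspect.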

Related papers

Disentangling Ambiguity from Instability in Large Language Models: A Clinical Text-to-SQL Case Study [0.3437656066916039]
We propose CLUES, a framework that models Text-to-SQL as a two-stage process. It decomposes semantic uncertainty into an ambiguity score and an instability score. CLUES improves failure prediction over state-of-the-art Kernel Entropy matrix.
arXiv Detail & Related papers (2026-02-12T14:46:20Z)
Ensembling LLM-Induced Decision Trees for Explainable and Robust Error Detection [24.742137117129502]
Error detection (ED) is important for ensuring data quality. Recent state-of-the-art ED methods leverage the pre-trained knowledge and semantic capability embedded in large language models (LLMs) to directly label whether a cell is erroneous. We propose an LLM-as-an-inducer framework that uses an LLM to induce a decision tree for ED (termed TreeED) and further ensembles multiple such trees for consensus detection (termed ForestED). Our methods are accurate, explainable and robust, achieving an average F1-score improvement of 16.1% over the best baseline.
arXiv Detail & Related papers (2025-12-08T07:40:48Z)
FGC-Comp: Adaptive Neighbor-Grouped Attribute Completion for Graph-based Anomaly Detection [0.0]
FGC-Comp is a lightweight, classifier-agnostic, and deployment-friendly attribute completion module. We partition each node's neighbors into three label-based groups, apply group-specific transforms to the labeled groups, and train end-to-end with a binary classification objective. Experiments on two real-world fraud datasets validate the effectiveness of the approach with negligible computational overhead.
arXiv Detail & Related papers (2025-12-02T12:34:21Z)
Prompt-Matcher: Leveraging Large Models to Reduce Uncertainty in Schema Matching Results [1.13107643869251]
We introduce a new approach based on fine-grained correspondence verification using specific prompts to a large language model. Our approach is an iterative loop that consists of three main components: (1) the correspondence selection algorithm, (2) correspondence verification, and (3) the update of the probability distribution. We propose a novel $(1-1/e)$-approximation algorithm that significantly outperforms the brute-force algorithm in terms of computational efficiency.
arXiv Detail & Related papers (2024-08-24T16:54:08Z)
Semi-DETR: Semi-Supervised Object Detection with Detection Transformers [105.45018934087076]
We analyze the DETR-based framework on semi-supervised object detection (SSOD). We present Semi-DETR, the first transformer-based end-to-end semi-supervised object detector. Our method outperforms all state-of-the-art methods by clear margins.
arXiv Detail & Related papers (2023-07-16T16:32:14Z)
Neighbour Consistency Guided Pseudo-Label Refinement for Unsupervised Person Re-Identification [80.98291772215154]
Unsupervised person re-identification (ReID) aims at learning discriminative identity features for person retrieval without any annotations. Recent advances accomplish this task by leveraging clustering-based pseudo labels. We propose a Neighbour Consistency guided Pseudo Label Refinement framework.
arXiv Detail & Related papers (2022-11-30T09:39:57Z)
SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers [61.48159785138462]
This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic uncertainties in the neural network based approaches (called SUN). Extensive experiments on five benchmark datasets demonstrate that our method significantly outperforms competitors and achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-09-14T06:27:51Z)
Semi-supervised Contrastive Outlier removal for Pseudo Expectation Maximization (SCOPE) [2.33877878310217]
We present a new approach to suppress confounding errors through a method we describe as Semi-supervised Contrastive Outlier removal for Pseudo Expectation Maximization (SCOPE). Our results show that SCOPE greatly improves semi-supervised classification accuracy over a baseline and, when combined with consistency regularization, achieves the highest reported accuracy for the semi-supervised CIFAR-10 classification task using 250 and 4000 labeled samples.
arXiv Detail & Related papers (2022-06-28T19:32:50Z)
Approximate Conditional Coverage via Neural Model Approximations [0.030458514384586396]
We analyze a data-driven procedure for obtaining empirically reliable approximate conditional coverage. We demonstrate the potential for substantial (and otherwise unknowable) under-coverage with split-conformal alternatives with marginal coverage guarantees.
arXiv Detail & Related papers (2022-05-28T02:59:05Z)
Delving into Probabilistic Uncertainty for Unsupervised Domain Adaptive Person Re-Identification [54.174146346387204]
We propose an approach named probabilistic uncertainty guided progressive label refinery (P$^2$LR) for domain adaptive person re-identification. A quantitative criterion is established to measure the uncertainty of pseudo labels and facilitate the network training. Our method outperforms the baseline by 6.5% mAP on the Duke2Market task, while surpassing the state-of-the-art method by 2.5% mAP on the Market2MSMT task.
arXiv Detail & Related papers (2021-12-28T07:40:12Z)
Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature spaces of each layer. With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task-specific covariance matrices. We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels. Our uncertainty-guided optimization brings significant improvement and achieves the state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2020-12-16T04:09:04Z)
Revisiting One-vs-All Classifiers for Predictive Uncertainty and Out-of-Distribution Detection in Neural Networks [22.34227625637843]
We investigate how the parametrization of the probabilities in discriminative classifiers affects the uncertainty estimates. We show that one-vs-all formulations can improve calibration on image classification tasks.
arXiv Detail & Related papers (2020-07-10T01:55:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.