Can Uncertainty Quantification Improve Learned Index Benefit Estimation?
- URL: http://arxiv.org/abs/2410.17748v2
- Date: Tue, 02 Sep 2025 06:14:14 GMT
- Title: Can Uncertainty Quantification Improve Learned Index Benefit Estimation?
- Authors: Tao Yu, Zhaonian Zou, Hao Xiong,
- Abstract summary: Index tuning is crucial for optimizing database performance by selecting optimal indexes based on workload.<n>Traditional methods relying on what-if tools often suffer from inefficiency and inaccuracy.<n>We propose Beauty, the first uncertainty-aware framework that enhances learning-based models with uncertainty quantification.
- Score: 11.25347279227943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Index tuning is crucial for optimizing database performance by selecting optimal indexes based on workload. The key to this process lies in an accurate and efficient benefit estimator. Traditional methods relying on what-if tools often suffer from inefficiency and inaccuracy. In contrast, learning-based models provide a promising alternative but face challenges such as instability, lack of interpretability, and complex management. To overcome these limitations, we adopt a novel approach: quantifying the uncertainty in learning-based models' results, thereby combining the strengths of both traditional and learning-based methods for reliable index tuning. We propose Beauty, the first uncertainty-aware framework that enhances learning-based models with uncertainty quantification and uses what-if tools as a complementary mechanism to improve reliability and reduce management complexity. Specifically, we introduce a novel method that combines AutoEncoder and Monte Carlo Dropout to jointly quantify uncertainty, tailored to the characteristics of benefit estimation tasks. In experiments involving sixteen models, our approach outperformed existing uncertainty quantification methods in the majority of cases. We also conducted index tuning tests on six datasets. By applying the Beauty framework, we eliminated worst-case scenarios and more than tripled the occurrence of best-case scenarios.
Related papers
- Principled Algorithms for Optimizing Generalized Metrics in Binary Classification [53.604375124674796]
We introduce principled algorithms for optimizing generalized metrics, supported by $H$-consistency and finite-sample generalization bounds.<n>Our approach reformulates metric optimization as a generalized cost-sensitive learning problem.<n>We develop new algorithms, METRO, with strong theoretical performance guarantees.
arXiv Detail & Related papers (2025-12-29T01:33:42Z) - On the Optimality of Tracking Fisher Information in Adaptive Testing with Stochastic Binary Responses [3.491999371287298]
We study the problem of estimating a continuous ability parameter from sequential binary responses.<n>We propose a simple algorithm that adaptively selects questions to maximize Fisher information.<n>We prove that this Fisher-tracking strategy achieves optimal performance in both fixed-confidence and fixed-budget regimes.
arXiv Detail & Related papers (2025-10-09T07:10:00Z) - Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling [48.15636223774418]
Large language models (LLMs) frequently hallucinate due to misaligned self-awareness.
Existing approaches mitigate hallucinations via uncertainty estimation or query rejection.
We propose the Explicit Knowledge Boundary Modeling framework to integrate fast and slow reasoning systems.
arXiv Detail & Related papers (2025-03-04T03:16:02Z) - Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z) - Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences [6.067007470552307]
We propose a methodology for finding sequences of machine learning models that are stable across retraining iterations.
We develop a mixed-integer optimization formulation that is guaranteed to recover optimal models.
Our method shows stronger stability than greedily trained models with a small, controllable sacrifice in predictive power.
arXiv Detail & Related papers (2024-03-28T22:45:38Z) - On Uncertainty Calibration and Selective Generation in Probabilistic
Neural Summarization: A Benchmark Study [14.041071717005362]
Modern deep models for summarization attains impressive benchmark performance, but they are prone to generating miscalibrated predictive uncertainty.
This means that they assign high confidence to low-quality predictions, leading to compromised reliability and trustworthiness in real-world applications.
Probabilistic deep learning methods are common solutions to the miscalibration problem, but their relative effectiveness in complex autoregressive summarization tasks are not well-understood.
arXiv Detail & Related papers (2023-04-17T23:06:28Z) - Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z) - A Learning-Based Optimal Uncertainty Quantification Method and Its
Application to Ballistic Impact Problems [1.713291434132985]
This paper concerns the optimal (supremum and infimum) uncertainty bounds for systems where the input (or prior) measure is only partially/imperfectly known.
We demonstrate the learning based framework on the uncertainty optimization problem.
We show that the approach can be used to construct maps for the performance certificate and safety in engineering practice.
arXiv Detail & Related papers (2022-12-28T14:30:53Z) - Post-hoc Uncertainty Learning using a Dirichlet Meta-Model [28.522673618527417]
We propose a novel Bayesian meta-model to augment pre-trained models with better uncertainty quantification abilities.
Our proposed method requires no additional training data and is flexible enough to quantify different uncertainties.
We demonstrate our proposed meta-model approach's flexibility and superior empirical performance on these applications.
arXiv Detail & Related papers (2022-12-14T17:34:11Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model
Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z) - Accurate and Robust Feature Importance Estimation under Distribution
Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z) - Efficient Ensemble Model Generation for Uncertainty Estimation with
Bayesian Approximation in Segmentation [74.06904875527556]
We propose a generic and efficient segmentation framework to construct ensemble segmentation models.
In the proposed method, ensemble models can be efficiently generated by using the layer selection method.
We also devise a new pixel-wise uncertainty loss, which improves the predictive performance.
arXiv Detail & Related papers (2020-05-21T16:08:38Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches, is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.