Database Workload Characterization with Query Plan Encoders
- URL: http://arxiv.org/abs/2105.12287v1
- Date: Wed, 26 May 2021 01:17:27 GMT
- Title: Database Workload Characterization with Query Plan Encoders
- Authors: Debjyoti Paul, Jie Cao, Feifei Li, Vivek Srikumar
- Abstract summary: We propose our query plan encoders that learn essential features and their correlations from query plans.
Our pretrained encoders capture the structural features and the computational performance of queries independently.
- Score: 32.941042348628606
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Smart databases are adopting artificial intelligence (AI) technologies to
achieve instance optimality, and in the future, databases will come with
prepackaged AI models within their core components. The reason is that every
database runs on different workloads and demands specific resources and settings
to achieve optimal performance. This prompts the need to understand the
workloads running in the system, along with their features, comprehensively,
a problem we dub workload characterization.
To address this workload characterization problem, we propose our query plan
encoders that learn essential features and their correlations from query plans.
Our pretrained encoders capture the structural features and the computational
performance of queries independently. We show that our pretrained encoders are
adaptable to workloads, which expedites the transfer learning process. We
performed independent assessments of the structural and performance
encoders with multiple downstream tasks. For the overall evaluation of our
query plan encoders, we architect two downstream tasks: (i) query latency
prediction and (ii) query classification. These tasks show the importance of
feature-based workload characterization. We also performed extensive
experiments on individual encoders to verify the effectiveness of
representation learning and domain adaptability.
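The two-encoder idea from the abstract can be illustrated with a minimal sketch. This is a hypothetical toy, not the paper's actual architecture: the `PlanNode` fields, the operator list, and both feature extractors are illustrative stand-ins for learned encoders, chosen only to show how structural shape and computational cost can be encoded independently from the same query plan tree.

```python
# Hypothetical sketch: two independent encodings of a query plan tree,
# one capturing structure (operator mix, depth), one capturing estimated
# computational load. Real encoders in the paper are learned models.
from dataclasses import dataclass, field
from typing import List

@dataclass
class PlanNode:
    op: str                      # operator name, e.g. "SeqScan", "HashJoin"
    est_rows: float              # optimizer's row-count estimate
    est_cost: float              # optimizer's cost estimate
    children: List["PlanNode"] = field(default_factory=list)

# Illustrative operator vocabulary (unknown ops would need handling).
OPS = ["SeqScan", "IndexScan", "HashJoin", "NestLoop", "Sort", "Aggregate"]

def structural_encoding(node: PlanNode) -> List[float]:
    """Bag-of-operators plus tree depth: plan shape only, no cost info."""
    counts = [0.0] * len(OPS)
    def walk(n: PlanNode, depth: int) -> int:
        counts[OPS.index(n.op)] += 1.0
        return max([walk(c, depth + 1) for c in n.children], default=depth)
    max_depth = walk(node, 0)
    return counts + [float(max_depth)]

def performance_encoding(node: PlanNode) -> List[float]:
    """Aggregate cost statistics: computational load only, no shape info."""
    rows, costs = [], []
    def walk(n: PlanNode) -> None:
        rows.append(n.est_rows)
        costs.append(n.est_cost)
        for c in n.children:
            walk(c)
    walk(node)
    return [sum(rows), max(costs), sum(costs) / max(len(costs), 1)]

# A toy two-way join plan.
plan = PlanNode("HashJoin", 1e4, 250.0, [
    PlanNode("SeqScan", 1e5, 120.0),
    PlanNode("IndexScan", 1e3, 15.0),
])
print(structural_encoding(plan))   # operator counts + depth
print(performance_encoding(plan))  # total rows, max cost, mean cost
```

Keeping the two encodings separate is what lets each be pretrained and assessed independently, as the abstract describes; a downstream task such as latency prediction would consume their concatenation.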
Related papers
- Sibyl: Forecasting Time-Evolving Query Workloads [9.16115447503004]
Database systems often rely on historical query traces to perform workload-based performance tuning.
Real production workloads are time-evolving, making historical queries ineffective for optimizing future workloads.
We propose SIBYL, an end-to-end machine learning-based framework that accurately forecasts a sequence of future queries.
arXiv Detail & Related papers (2024-01-08T08:11:32Z)
- Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model [78.80174696043021]
We propose a novel model called the Entity-Based Relevance Model (EBRM)
The decomposition allows us to use a Cross-encoder QE relevance module for high accuracy.
We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.
arXiv Detail & Related papers (2023-07-01T15:44:53Z)
- A Unified Active Learning Framework for Annotating Graph Data with Application to Software Source Code Performance Prediction [4.572330678291241]
We develop a unified active learning framework specializing in software performance prediction.
We investigate the impact of using different levels of information for active and passive learning.
Our approach aims to improve the investment in AI models for different software performance predictions.
arXiv Detail & Related papers (2023-04-06T14:00:48Z)
- MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders are Better Dense Retrievers [140.0479479231558]
In this work, we aim to unify a variety of pre-training tasks into a multi-task pre-trained model, namely MASTER.
MASTER utilizes a shared-encoder multi-decoder architecture that can construct a representation bottleneck to compress the abundant semantic information across tasks into dense vectors.
arXiv Detail & Related papers (2022-12-15T13:57:07Z)
- Learning Action-Effect Dynamics for Hypothetical Vision-Language Reasoning Task [50.72283841720014]
We propose a novel learning strategy that can improve reasoning about the effects of actions.
We demonstrate the effectiveness of our proposed approach and discuss its advantages over previous baselines in terms of performance, data efficiency, and generalization capability.
arXiv Detail & Related papers (2022-12-07T05:41:58Z)
- Interpretable by Design: Learning Predictors by Composing Interpretable Queries [8.054701719767293]
We argue that machine learning algorithms should be interpretable by design.
We minimize the expected number of queries needed for accurate prediction.
Experiments on vision and NLP tasks demonstrate the efficacy of our approach.
arXiv Detail & Related papers (2022-07-03T02:40:34Z)
- Self-Supervised Visual Representation Learning Using Lightweight Architectures [0.0]
In self-supervised learning, a model is trained to solve a pretext task, using a data set whose annotations are created by a machine.
We critically examine the most notable pretext tasks to extract features from image data.
We study the performance of various self-supervised techniques keeping all other parameters uniform.
arXiv Detail & Related papers (2021-10-21T14:13:10Z)
- Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks.
We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model.
arXiv Detail & Related papers (2020-10-10T14:03:20Z)
- KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT)
All tasks in KILT are grounded in the same snapshot of Wikipedia.
We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)
- How Useful is Self-Supervised Pretraining for Visual Tasks? [133.1984299177874]
We evaluate various self-supervised algorithms across a comprehensive array of synthetic datasets and downstream tasks.
Our experiments offer insights into how the utility of self-supervision changes as the number of available labels grows.
arXiv Detail & Related papers (2020-03-31T16:03:22Z)
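The ACE entry above describes a controller, trained with reinforcement learning, that searches for good concatenations of embeddings using task accuracy as the reward. A heavily simplified sketch of that idea follows; `EMBEDDINGS`, `TRUE_GAIN`, and `evaluate` are invented stand-ins (the real reward comes from training a task model, not a formula), and the Bernoulli controller with a REINFORCE-style update is an illustration of the search mechanism, not ACE's actual controller.

```python
# Hypothetical sketch of ACE's search loop: sample a subset of embeddings
# to concatenate, score it, and update per-embedding inclusion logits.
import math
import random

EMBEDDINGS = ["word", "char", "bert", "elmo"]

# Simulated usefulness of each embedding (stand-in for real task gains).
TRUE_GAIN = {"word": 0.03, "char": 0.005, "bert": 0.06, "elmo": 0.03}

def evaluate(subset):
    """Stand-in for training a task model: simulated accuracy, with a
    small per-embedding penalty so bigger is not always better."""
    return 0.80 + sum(TRUE_GAIN[e] for e in subset) - 0.02 * len(subset)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Controller: one inclusion logit per embedding (starts at p = 0.5 each).
logits = {e: 0.0 for e in EMBEDDINGS}

def sample():
    return [e for e in EMBEDDINGS if random.random() < sigmoid(logits[e])]

random.seed(0)
baseline = evaluate(EMBEDDINGS)  # reward baseline: concatenate everything
for step in range(200):
    subset = sample()
    reward = evaluate(subset) - baseline
    for e in EMBEDDINGS:
        p = sigmoid(logits[e])
        # REINFORCE gradient of log-prob w.r.t. the logit: x - p.
        grad = (1.0 - p) if e in subset else -p
        logits[e] += 0.5 * reward * grad

# Embeddings the controller learned to prefer.
print([e for e in EMBEDDINGS if logits[e] > 0])
```

The baseline subtraction mirrors the common variance-reduction trick in policy-gradient training: only subsets that beat the use-everything concatenation receive positive reward.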
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.