Automated Database Indexing using Model-free Reinforcement Learning
- URL: http://arxiv.org/abs/2007.14244v1
- Date: Sat, 25 Jul 2020 14:36:55 GMT
- Title: Automated Database Indexing using Model-free Reinforcement Learning
- Authors: Gabriel Paludo Licks and Felipe Meneguzzi
- Abstract summary: We develop an architecture to solve the problem of automatically indexing a database by using reinforcement learning to optimize queries by indexing data throughout the lifetime of a database.
In our experimental evaluation, our architecture shows superior performance compared to related work on reinforcement learning and genetic algorithms, maintaining near-optimal index configurations and efficiently scaling to large databases.
- Score: 19.64574177805823
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Configuring databases for efficient querying is a complex task, often carried
out by a database administrator. Solving the problem of building indexes that
truly optimize database access requires a substantial amount of database and
domain knowledge, the lack of which often results in wasted space and memory
for irrelevant indexes, possibly jeopardizing database performance for querying
and certainly degrading performance for updating. We develop an architecture to
solve the problem of automatically indexing a database by using reinforcement
learning to optimize queries by indexing data throughout the lifetime of a
database. In our experimental evaluation, our architecture shows superior
performance compared to related work on reinforcement learning and genetic
algorithms, maintaining near-optimal index configurations and efficiently
scaling to large databases.
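As an illustration of the idea in the abstract, here is a minimal sketch of model-free reinforcement learning applied to index selection. It uses tabular Q-learning on a toy problem and is not the authors' architecture. The four-column table, the synthetic cost function, the no-op action, and all names (`cost`, `step`, `train`, `greedy_rollout`) are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy setup: a table with 4 columns whose workload
# only filters on columns 0 and 2.
N_COLUMNS = 4
USEFUL = frozenset({0, 2})
NOOP = N_COLUMNS                      # extra action: leave the configuration alone
ACTIONS = list(range(N_COLUMNS + 1))  # toggle column 0..3, or do nothing


def cost(config):
    """Synthetic workload cost (an assumption, not a measured value):
    10 per useful column left unindexed, plus 1 per index maintained."""
    return 10 * len(USEFUL - config) + len(config)


def step(config, action):
    """Apply an action and return (next_config, reward).
    The reward is the reduction in synthetic cost."""
    if action == NOOP:
        return config, 0
    nxt = config ^ frozenset({action})  # create or drop the index on that column
    return nxt, cost(config) - cost(nxt)


def train(episodes=3000, horizon=20, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration and random start states."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (config, action) to an estimated return
    for _episode in range(episodes):
        config = frozenset(c for c in range(N_COLUMNS) if rng.random() < 0.5)
        for _t in range(horizon):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(config, a)])
            nxt, reward = step(config, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            # Model-free update: only the observed transition is used.
            q[(config, action)] += alpha * (reward + gamma * best_next - q[(config, action)])
            config = nxt
    return q


def greedy_rollout(q, config=frozenset(), steps=10):
    """Follow the learned greedy policy from `config` for a fixed number of steps."""
    for _t in range(steps):
        action = max(ACTIONS, key=lambda a: q[(config, a)])
        config, _reward = step(config, action)
    return config


if __name__ == "__main__":
    q_table = train()
    print(sorted(greedy_rollout(q_table)))
```

Under this synthetic cost model the greedy rollout settles on indexes for columns 0 and 2 and then stops changing the configuration. A real system would replace `cost` with measured or estimated workload cost.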
Related papers
- LLMIdxAdvis: Resource-Efficient Index Advisor Utilizing Large Language Model [24.579793425796193]
We propose a resource-efficient index advisor that uses large language models (LLMs) without extensive fine-tuning.
It frames index recommendation as a sequence-to-sequence task, taking the target workload, storage constraint, and corresponding database environment as input.
Experiments on 3 OLAP and 2 real-world benchmarks reveal that LLMIdxAdvis delivers competitive index recommendation with reduced runtime.
arXiv Detail & Related papers (2025-03-10T22:01:24Z) - Role of Databases in GenAI Applications [0.0]
Generative AI (GenAI) is transforming industries by enabling intelligent content generation, automation, and decision-making.
This paper explores the critical role of databases in GenAI, emphasizing the importance of choosing the right database architecture.
It categorizes database roles into conversational context (key-value/document databases), situational context (relational databases/data lakehouses), and semantic context (vector databases).
arXiv Detail & Related papers (2025-03-05T20:32:21Z) - AnDB: Breaking Boundaries with an AI-Native Database for Universal Semantic Analysis [11.419119182421964]
AnDB is an AI-native database that supports traditional database workloads and AI-driven tasks.
AnDB allows users to perform semantic queries using intuitive statements without requiring AI expertise.
AnDB future-proofs data management infrastructure, empowering users to effectively and efficiently harness the full potential of all kinds of data without starting from scratch.
arXiv Detail & Related papers (2025-02-19T15:15:59Z) - Differentially Private Learned Indexes [4.290415158471898]
We address the problem of efficiently answering predicate queries on encrypted databases, those secured by Trusted Execution Environments (TEEs).
A common strategy in modern databases to accelerate predicate queries is the use of indexes, which map attribute values (keys) to their corresponding positions in a sorted data array.
Unfortunately, indexes cannot be directly applied to encrypted databases due to strong data-dependent leakages.
We propose leveraging learned indexes, a trending technique that repurposes machine learning models as indexing structures, to build more compact DP indexes.
arXiv Detail & Related papers (2024-10-28T16:04:58Z) - Relational Database Augmented Large Language Model [59.38841050766026]
Large language models (LLMs) excel in many natural language processing (NLP) tasks.
They can only incorporate new knowledge through training or supervised fine-tuning processes.
This precise, up-to-date, and private information is typically stored in relational databases.
arXiv Detail & Related papers (2024-07-21T06:19:10Z) - RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL [48.516004807486745]
Large language models (LLMs) with in-context learning have significantly improved the performance of the text-to-SQL task.
We propose RB-SQL, a novel retrieval-based framework for in-context prompt engineering.
Experiment results demonstrate that our model achieves better performance than several competitive baselines on public datasets BIRD and Spider.
arXiv Detail & Related papers (2024-07-11T08:19:58Z) - Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu).
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
arXiv Detail & Related papers (2024-06-23T05:02:21Z) - KnobTree: Intelligent Database Parameter Configuration via Explainable Reinforcement Learning [9.94061240360141]
This paper proposes KnobTree, an interpretable framework designed for the optimization of database parameter configuration.
Experiments conducted on Knob and Gbase8s databases have verified exceptional transparency and interpretability of the model.
Our approach also slightly outperforms the existing RL-based tuning algorithms in aspects such as throughput, latency, and processing time.
arXiv Detail & Related papers (2024-06-21T11:40:55Z) - LIST: Learning to Index Spatio-Textual Data for Embedding based Spatial Keyword Queries [53.843367588870585]
Top-k kNN spatial keyword queries (TkQs) return a list of objects based on a ranking function that considers both spatial and textual relevance.
There are two key challenges in building an effective and efficient index, i.e., the absence of high-quality labels and the unbalanced results.
We develop a novel pseudolabel generation technique to address the two challenges.
arXiv Detail & Related papers (2024-03-12T05:32:33Z) - LLM As DBA [25.92711955279298]
Large language models (LLMs) have shown great potential to understand valuable documents and generate reasonable answers.
This paper presents a revolutionary LLM-centric framework for database maintenance, including (i) database maintenance knowledge detection from documents and tools, (ii) tree of thought reasoning for root cause analysis, and (iii) collaborative diagnosis among multiple LLMs.
arXiv Detail & Related papers (2023-08-10T10:12:43Z) - SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the SQL-PaLM framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep into understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z) - Semi-Structured Query Grounding for Document-Oriented Databases with Deep Retrieval and Its Application to Receipt and POI Matching [23.52046767195031]
We aim to address practical challenges when using embedding-based retrieval for the query grounding problem in semi-structured data.
We conduct extensive experiments to find the most effective combination of modules for the embedding and retrieval of both query and database entries.
The proposed model significantly outperforms the conventional manual pattern-based model while requiring much less development and maintenance cost.
arXiv Detail & Related papers (2022-02-23T05:32:34Z) - Baihe: SysML Framework for AI-driven Databases [33.47034563589278]
Using Baihe, an existing relational database system may be retrofitted to use learned components for query optimization or other common tasks.
Baihe's high-level architecture is based on the following requirements: separation from the core system, minimal third-party dependencies, robustness, stability, and fault tolerance.
arXiv Detail & Related papers (2021-12-29T09:00:07Z) - The Case for Learned Spatial Indexes [62.88514422115702]
We use techniques from a state-of-the-art learned multi-dimensional index structure (namely, Flood) to answer spatial range queries.
We show that machine-learned search within a partition is 11.79% to 39.51% faster than binary search when filtering on one dimension.
We also show that refining with machine-learned indexes is 1.23x to 1.83x faster than the closest competitor, which filters on two dimensions.
arXiv Detail & Related papers (2020-08-24T12:09:55Z)
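Two of the entries above describe learned indexes, where a model maps a key to its position in a sorted array. The following is a minimal sketch of that general idea, assuming a single least-squares line and a recorded worst-case error that bounds a final binary search. It is not the Flood structure or the differentially private index from those papers, and the class name `LinearLearnedIndex` and its methods are hypothetical.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index: a fitted line predicts a key's position in a
    sorted array, and a bounded binary search corrects the prediction."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("keys must be non-empty")
        self.keys = sorted(keys)
        n = len(self.keys)
        # Least-squares fit of position as a linear function of key.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst prediction error over the stored keys bounds the search window.
        self.max_err = max(abs(i - self._predict(k)) for i, k in enumerate(self.keys))

    def _predict(self, key):
        """Predicted position, clamped to the valid index range."""
        pos = round(self.slope * key + self.intercept)
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of `key` in the sorted array, or None if absent."""
        n = len(self.keys)
        pos = self._predict(key)
        lo = max(0, pos - self.max_err)
        hi = min(n, pos + self.max_err + 1)
        # Binary search only inside the window allowed by the error bound.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None


if __name__ == "__main__":
    index = LinearLearnedIndex([3, 8, 15, 16, 23, 42, 57, 91])
    print(index.lookup(23), index.lookup(4))
```

The narrower the error bound, the fewer comparisons the final search needs, which is the property that makes the model worthwhile compared with a plain binary search over the whole array.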
This list is automatically generated from the titles and abstracts of the papers in this site.