Accelerating String-Key Learned Index Structures via Memoization-based Incremental Training
- URL: http://arxiv.org/abs/2403.11472v1
- Date: Mon, 18 Mar 2024 04:44:00 GMT
- Title: Accelerating String-Key Learned Index Structures via Memoization-based Incremental Training
- Authors: Minsu Kim, Jinwoo Hwang, Guseul Heo, Seiyeon Cho, Divya Mahajan, Jongse Park
- Abstract summary: Learned indexes use machine learning models to learn the mappings between keys and their corresponding positions in key-value indexes.
They require frequent retrainings of their models to incorporate the changes introduced by update queries.
We develop an algorithm-hardware co-designed string-key learned index system, dubbed SIA.
- Score: 16.93830041971135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learned indexes use machine learning models to learn the mappings between keys and their corresponding positions in key-value indexes. These indexes use the mapping information as training data. Learned indexes require frequent retrainings of their models to incorporate the changes introduced by update queries. To efficiently retrain the models, existing learned index systems often harness a linear algebraic QR factorization technique that performs matrix decomposition. This factorization approach processes all key-position pairs during each retraining, resulting in compute operations that grow linearly with the total number of keys and their lengths. Consequently, the retrainings create a severe performance bottleneck, especially for variable-length string keys, even though retrainings are crucial for maintaining high prediction accuracy and, in turn, low query service latency. To address this performance problem, we develop an algorithm-hardware co-designed string-key learned index system, dubbed SIA. In designing SIA, we leverage a unique algorithmic property of the matrix decomposition-based training method. Exploiting this property, we develop a memoization-based incremental training scheme, which only requires computation over updated keys, while decomposition results of non-updated keys from previous computations can be reused. We further enhance SIA to offload a portion of this training process to an FPGA accelerator, not only relieving CPU resources for serving index queries (i.e., inference) but also accelerating the training itself. Our evaluation shows that, compared to ALEX, LIPP, and SIndex, state-of-the-art learned index systems, SIA-accelerated learned indexes offer 2.6x and 3.4x higher throughput on two real-world benchmark suites, YCSB and Twitter cache trace, respectively.
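The key observation is that least-squares training via QR factorization decomposes over rows, so factorization work for unchanged keys can be cached and only updated keys need new computation. Below is a minimal sketch of that memoization idea using SciPy's row-wise QR update; the feature map, class name, and use of qr_insert are illustrative assumptions for this sketch, not SIA's actual implementation (which co-designs the algorithm with an FPGA accelerator).

```python
# Minimal sketch of memoization-based incremental training for a linear
# string-key index model: reuse the previous (Q, R) factorization and fold
# in only the rows for newly updated keys.
import numpy as np
from scipy.linalg import qr, qr_insert, solve_triangular

def featurize(keys, width=8):
    # Hypothetical fixed-width feature map: first `width` byte values + bias.
    X = np.zeros((len(keys), width + 1))
    for i, key in enumerate(keys):
        raw = key.encode()[:width]
        X[i, :len(raw)] = list(raw)
        X[i, -1] = 1.0
    return X

class MemoizedLinearIndex:
    def __init__(self, keys, positions):
        A = featurize(keys)                   # assumes more keys than features
        self.Q, self.R = qr(A)                # full QR, memoized across updates
        self.b = np.asarray(positions, dtype=float)
        self._refit()

    def insert(self, new_keys, new_positions):
        # Incremental step: only the new rows are factored in; the memoized
        # (Q, R) from all previous keys is reused rather than recomputed.
        U = featurize(new_keys)
        self.Q, self.R = qr_insert(self.Q, self.R, U, 0, which='row')
        self.b = np.concatenate([np.asarray(new_positions, dtype=float), self.b])
        self._refit()

    def _refit(self):
        # Least-squares weights from the triangular factor:
        # R[:n, :n] w = (Q^T b)[:n]
        n = self.R.shape[1]
        self.w = solve_triangular(self.R[:n], (self.Q.T @ self.b)[:n])

    def predict(self, keys):
        return featurize(keys) @ self.w
```

In this sketch, inserting a batch of keys avoids re-decomposing every key-position pair from scratch; a production system would keep economy-size factors to bound memory growth.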
Related papers
- Squeezed Attention: Accelerating Long Context Length LLM Inference [64.11145320159126]
We propose Squeezed Attention as a mechanism to accelerate LLM applications where a large portion of the input prompt is fixed.
We use K-means clustering offline to group the keys for the fixed context based on semantic similarity and represent each cluster with a single centroid value.
We then compute exact attention using only these important keys from the fixed context, thereby reducing bandwidth and computational costs.
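A toy sketch of that gating step follows; the cluster count, `keep` heuristic, and function names are illustrative assumptions, not the paper's configuration.

```python
# Offline: cluster the fixed context's keys; online: keep only keys from
# clusters whose centroid scores highly against the query, then run exact
# softmax attention over the retained keys.
import numpy as np
from scipy.cluster.vq import kmeans2

def build_clusters(K_fixed, n_clusters=32, seed=0):
    centroids, labels = kmeans2(K_fixed, n_clusters, seed=seed, minit='++')
    return centroids, labels

def squeezed_attention(q, K_fixed, V_fixed, centroids, labels, keep=4):
    top = np.argsort(centroids @ q)[-keep:]       # most promising clusters
    mask = np.isin(labels, top)
    K, V = K_fixed[mask], V_fixed[mask]
    scores = K @ q / np.sqrt(K.shape[1])          # exact attention, fewer keys
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V
```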
arXiv Detail & Related papers (2024-11-14T18:54:19Z)
- Exact, Tractable Gauss-Newton Optimization in Deep Reversible Architectures Reveal Poor Generalization [52.16435732772263]
Second-order optimization has been shown to accelerate the training of deep neural networks in many applications.
However, generalization properties of second-order methods are still being debated.
We show for the first time that exact Gauss-Newton (GN) updates take on a tractable form in a class of deep architectures.
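For context, a Gauss-Newton update for a generic nonlinear least-squares problem looks like the sketch below; the paper's contribution is showing such updates become exact and tractable in deep reversible architectures, which this standalone example does not reproduce.

```python
# One damped Gauss-Newton step: w_new = w - (J^T J + damping*I)^{-1} J^T r,
# where r is the residual vector and J its Jacobian with respect to w.
import numpy as np

def gauss_newton_step(w, residual_fn, jacobian_fn, damping=1e-6):
    r = residual_fn(w)                        # residuals, shape (m,)
    J = jacobian_fn(w)                        # Jacobian dr/dw, shape (m, d)
    H = J.T @ J + damping * np.eye(len(w))    # GN approximation of the Hessian
    return w - np.linalg.solve(H, J.T @ r)
```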
arXiv Detail & Related papers (2024-11-12T17:58:40Z)
- UpLIF: An Updatable Self-Tuning Learned Index Framework [4.077820670802213]
UpLIF is an adaptive self-tuning learned index that adjusts the model to accommodate incoming updates.
We also introduce the concept of balanced model adjustment, which determines the model's inherent properties.
arXiv Detail & Related papers (2024-08-07T22:30:43Z)
- Semi-Parametric Retrieval via Binary Token Index [71.78109794895065]
Semi-parametric Vocabulary Disentangled Retrieval (SVDR) is a novel semi-parametric retrieval framework.
It supports two types of indexes: an embedding-based index for high effectiveness, akin to existing neural retrieval methods; and a binary token index that allows for quick and cost-effective setup, resembling traditional term-based retrieval.
It achieves a 3% higher top-1 retrieval accuracy compared to the dense retriever DPR when using an embedding-based index and a 9% higher top-1 accuracy compared to BM25 when using a binary token index.
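A rough sketch of what a binary token index looks like in contrast to an embedding-based one; the tokenizer and overlap scoring are simplifications assumed for illustration, not SVDR's exact design.

```python
# Each passage is a binary bag-of-tokens vector; retrieval scores are
# token-overlap counts, resembling traditional term-based retrieval.
import numpy as np

def build_binary_index(passages, vocab):
    index = np.zeros((len(passages), len(vocab)), dtype=bool)
    for i, text in enumerate(passages):
        for tok in text.lower().split():
            if tok in vocab:
                index[i, vocab[tok]] = True
    return index

def search(query, index, vocab, top_k=3):
    q = np.zeros(len(vocab), dtype=bool)
    for tok in query.lower().split():
        if tok in vocab:
            q[vocab[tok]] = True
    scores = (index & q).sum(axis=1)    # overlap between query and passage
    return np.argsort(scores)[::-1][:top_k]
```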
arXiv Detail & Related papers (2024-05-03T08:34:13Z)
- Accelerating Matrix Factorization by Dynamic Pruning for Fast Recommendation [0.49399484784577985]
Matrix factorization (MF) is a widely used collaborative filtering algorithm for recommendation systems (RSs).
With the dramatically increased number of users/items in current RSs, the computational complexity of training an MF model grows accordingly.
We propose algorithmic methods to accelerate MF without requiring additional computational resources.
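One illustrative reading of dynamic pruning is sketched below: plain SGD matrix factorization that skips updates for examples that are already well fit. The pruning rule here is an assumption for illustration, not necessarily the paper's criterion.

```python
# SGD matrix factorization with a simple dynamic-pruning rule: skip the
# gradient update when the rating error is already below a tolerance.
import numpy as np

def train_mf(ratings, n_users, n_items, rank=16, lr=0.01, reg=0.02,
             epochs=10, prune_tol=0.05, seed=0):
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, rank))   # user factors
    Q = rng.normal(scale=0.1, size=(n_items, rank))   # item factors
    for _ in range(epochs):
        for u, i, r in ratings:                       # (user, item, rating)
            err = r - P[u] @ Q[i]
            if abs(err) < prune_tol:
                continue                              # prune: already well fit
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q
```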
arXiv Detail & Related papers (2024-03-18T16:27:33Z)
- NFL: Robust Learned Index via Distribution Transformation [14.812854942243503]
This paper tackles the approximation problem by applying a distribution transformation to the keys before constructing the learned index.
A two-stage Normalizing-Flow-based Learned index framework (NFL) is proposed, which first transforms the original complex key distribution into a near-uniform distribution, then builds a learned index leveraging the transformed keys.
Based on the characteristics of the transformed keys, we propose a robust After-Flow Learned Index (AFLI).
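A small sketch of the two-stage idea, with a piecewise-linear empirical CDF standing in for the normalizing flow and a linear model standing in for AFLI (both stand-ins are assumptions for illustration):

```python
# Stage 1: map numeric keys to a near-uniform distribution via a coarse
# empirical CDF; Stage 2: fit a simple learned index on transformed keys,
# where position is almost linear in the transformed key.
import numpy as np

class TransformedLearnedIndex:
    def __init__(self, keys, n_quantiles=16):
        self.keys = np.sort(np.asarray(keys, dtype=float))  # assumes distinct keys
        n = len(self.keys)
        self.qs = np.quantile(self.keys, np.linspace(0, 1, n_quantiles))
        u = self._transform(self.keys)                       # near-uniform in [0, 1]
        self.slope, self.intercept = np.polyfit(u, np.arange(n), 1)

    def _transform(self, x):
        # Piecewise-linear empirical CDF built from a handful of quantiles.
        return np.interp(x, self.qs, np.linspace(0, 1, len(self.qs)))

    def predict_position(self, key):
        u = self._transform(float(key))
        pos = self.slope * u + self.intercept
        return int(np.clip(pos, 0, len(self.keys) - 1))
```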
arXiv Detail & Related papers (2022-05-24T06:03:19Z)
- CARMI: A Cache-Aware Learned Index with a Cost-based Construction Algorithm [1.9798034349981157]
We propose a cache-aware learned index (CARMI) design to improve the efficiency of the Recursive Model Index (RMI) framework.
We formulate the problem of finding the optimal design of a learned index as an optimization problem and propose a dynamic programming algorithm for solving it.
Experiments show that our index construction strategy can construct indexes with significantly better performance compared to baselines.
arXiv Detail & Related papers (2021-03-01T09:20:53Z)
- COAX: Correlation-Aware Indexing on Multidimensional Data with Soft Functional Dependencies [3.670422696827525]
We present COAX, a learned index for multidimensional data that learns the correlations between attributes of the dataset.
We show experimentally that by predicting correlated attributes in the data, we can improve the query execution time and reduce the memory overhead of the index.
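A condensed illustration of exploiting such a soft functional dependency: index one attribute, learn a linear correlation to a second attribute with an error bound, and answer range predicates on the second attribute by scanning a narrow verified slice. Class and method names here are illustrative assumptions.

```python
# Index only attribute x; predict correlated attribute y = a*x + b with a
# learned worst-case error bound eps, so a y-range query maps to a small
# x-range plus verification, saving a second full index.
import numpy as np

class CorrelationIndex:
    def __init__(self, x, y):
        order = np.argsort(x)
        self.x, self.y = np.asarray(x, float)[order], np.asarray(y, float)[order]
        self.a, self.b = np.polyfit(self.x, self.y, 1)   # y ~= a*x + b
        self.eps = np.abs(self.y - (self.a * self.x + self.b)).max()

    def query_y_range(self, y_lo, y_hi):
        # Assumes a positive slope; a negative correlation would flip bounds.
        x_lo = (y_lo - self.eps - self.b) / self.a
        x_hi = (y_hi + self.eps - self.b) / self.a
        lo, hi = np.searchsorted(self.x, [x_lo, x_hi])
        inside = (self.y[lo:hi] >= y_lo) & (self.y[lo:hi] <= y_hi)
        return self.x[lo:hi][inside], self.y[lo:hi][inside]
```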
arXiv Detail & Related papers (2020-06-29T21:22:15Z)
- Progressively Pretrained Dense Corpus Index for Open-Domain Question Answering [87.32442219333046]
We propose a simple and resource-efficient method to pretrain the paragraph encoder.
Our method outperforms an existing dense retrieval method that uses 7 times more computational resources for pretraining.
arXiv Detail & Related papers (2020-04-30T18:09:50Z)
- Post-Estimation Smoothing: A Simple Baseline for Learning with Side Information [102.18616819054368]
We propose a post-estimation smoothing operator as a fast and effective method for incorporating structural index data into prediction.
Because the smoothing step is separate from the original predictor, it applies to a broad class of machine learning tasks.
Our experiments on large scale spatial and temporal datasets highlight the speed and accuracy of post-estimation smoothing in practice.
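A compact sketch of the idea: take any base predictor's outputs and average them over neighbors that are close in the index. The 1-D temporal index and Gaussian kernel below are illustrative choices, not the paper's exact operator.

```python
# Post-estimation smoothing: the smoothing step is separate from the base
# predictor, so it applies to any model's outputs given index positions.
import numpy as np

def smooth_predictions(index, preds, bandwidth=5.0):
    index = np.asarray(index, dtype=float)    # e.g., timestamps or coordinates
    preds = np.asarray(preds, dtype=float)    # base predictor's outputs
    smoothed = np.empty_like(preds)
    for i, t in enumerate(index):
        w = np.exp(-0.5 * ((index - t) / bandwidth) ** 2)   # Gaussian kernel
        smoothed[i] = (w @ preds) / w.sum()
    return smoothed
```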
arXiv Detail & Related papers (2020-03-12T18:04:20Z)
- On Coresets for Support Vector Machines [61.928187390362176]
A coreset is a small, representative subset of the original data points.
We show that our algorithm can be used to extend the applicability of any off-the-shelf SVM solver to streaming, distributed, and dynamic data settings.
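A generic merge-and-reduce sketch of how a coreset construction extends a batch solver to streaming data; the uniform subsample below is a placeholder for the paper's actual sensitivity-based coreset.

```python
# Merge-and-reduce: buffer blocks of the stream, compress each block to a
# coreset, and recursively merge equal-level coresets, keeping memory small.
import numpy as np

def make_coreset(points, size, rng):
    # Placeholder compression: uniform subsample of the points.
    idx = rng.choice(len(points), size=min(size, len(points)), replace=False)
    return points[idx]

def stream_coreset(stream, block=1024, size=128, seed=0):
    rng = np.random.default_rng(seed)
    levels = {}                                # merge-and-reduce tree levels
    for start in range(0, len(stream), block):
        c = make_coreset(stream[start:start + block], size, rng)
        lvl = 0
        while lvl in levels:                   # merge equal-level coresets
            c = make_coreset(np.vstack([levels.pop(lvl), c]), size, rng)
            lvl += 1
        levels[lvl] = c
    return np.vstack(list(levels.values()))
```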
arXiv Detail & Related papers (2020-02-15T23:25:12Z)