Learning to Optimize LSM-trees: Towards A Reinforcement Learning based
Key-Value Store for Dynamic Workloads
- URL: http://arxiv.org/abs/2308.07013v2
- Date: Sun, 17 Sep 2023 08:55:09 GMT
- Title: Learning to Optimize LSM-trees: Towards A Reinforcement Learning based
Key-Value Store for Dynamic Workloads
- Authors: Dingheng Mo, Fanchao Chen, Siqiang Luo, Caihua Shan
- Abstract summary: We present RusKey, a key-value store with the following new features.
RusKey is a first attempt to orchestrate LSM-tree structures online.
It also introduces a new LSM-tree design, named FLSM-tree, for efficient transitions between different compaction policies.
- Score: 16.898360021759487
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LSM-trees are widely adopted as the storage backend of key-value stores.
However, optimizing the system performance under dynamic workloads has not been
sufficiently studied or evaluated in previous work. To fill the gap, we present
RusKey, a key-value store with the following new features: (1) RusKey is a
first attempt to orchestrate LSM-tree structures online to enable robust
performance in the context of dynamic workloads; (2) RusKey is the first
study to use Reinforcement Learning (RL) to guide LSM-tree transformations; (3)
RusKey includes a new LSM-tree design, named FLSM-tree, for an efficient
transition between different compaction policies -- the bottleneck of dynamic
key-value stores. We justify the superiority of the new design with theoretical
analysis; (4) RusKey requires no prior workload knowledge for system
adjustment, in contrast to state-of-the-art techniques. Experiments show that
RusKey exhibits strong performance robustness in diverse workloads, achieving
up to 4x better end-to-end performance than the RocksDB system under various
settings.
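To make the core idea concrete, the sketch below shows how a reinforcement learning agent could map coarse workload statistics to a compaction policy online. This is a minimal illustration under assumed names and a toy latency model, not RusKey's actual algorithm or its FLSM-tree transition mechanism; the states, actions, reward, and simulate_latency() are hypothetical.

```python
# Minimal sketch, not RusKey's actual algorithm: an epsilon-greedy Q-learning
# agent that picks an LSM-tree compaction policy online from recent workload
# statistics. The states, actions, reward model, and simulate_latency() are
# hypothetical placeholders for illustration only.
import random

ACTIONS = ["leveling", "tiering", "lazy-leveling"]   # candidate compaction policies
STATES = ["read-heavy", "balanced", "write-heavy"]   # coarse workload buckets

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

def bucketize(read_ratio: float) -> str:
    """Map the read fraction of the recent request window to a discrete state."""
    if read_ratio > 0.7:
        return "read-heavy"
    if read_ratio < 0.3:
        return "write-heavy"
    return "balanced"

def simulate_latency(state: str, action: str) -> float:
    """Toy stand-in for measuring end-to-end latency after applying a policy."""
    best = {"read-heavy": "leveling", "write-heavy": "tiering", "balanced": "lazy-leveling"}
    return 1.0 if action == best[state] else 2.0 + random.random()

def choose(state: str) -> str:
    if random.random() < EPSILON:                        # explore
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])     # exploit

state = "balanced"
for _ in range(10_000):
    action = choose(state)
    reward = -simulate_latency(state, action)            # lower latency = higher reward
    next_state = bucketize(random.random())              # workload drifts over time
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
    state = next_state

print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES})
```

In RusKey, the analogous decision is made against live workloads and the FLSM-tree design keeps the cost of switching between compaction policies low; the sketch above omits both concerns.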
Related papers
- DobLIX: A Dual-Objective Learned Index for Log-Structured Merge Trees [4.077820670802213]
DobLIX is a dual-objective learned index specifically designed for Log-Structured Merge (LSM) tree-based key-value stores.
We show that DobLIX reduces indexing overhead and improves throughput by 1.19 to 2.21 times compared to state-of-the-art methods within RocksDB.
arXiv Detail & Related papers (2025-02-07T22:48:14Z)
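DobLIX above and LearnedKV further down both build on learned indexes. As a rough, generic illustration of that idea only (not DobLIX's dual-objective design), a learned index fits a simple model from keys to positions in a sorted array and corrects the prediction with a search bounded by the model's maximum observed error:

```python
# Generic learned-index sketch for illustration only (not DobLIX's method):
# a least-squares linear model predicts the position of a key in a sorted
# array; lookups search only within the model's maximum observed error.
from bisect import bisect_left

class LinearLearnedIndex:
    def __init__(self, sorted_keys):
        self.keys = sorted_keys
        n = len(sorted_keys)
        mean_k = sum(sorted_keys) / n
        mean_p = (n - 1) / 2
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(sorted_keys))
        var = sum((k - mean_k) ** 2 for k in sorted_keys) or 1.0
        self.slope = cov / var
        self.intercept = mean_p - self.slope * mean_k
        # The worst-case prediction error bounds every future search window.
        self.err = max(abs(self._predict(k) - i) for i, k in enumerate(sorted_keys))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key, or None, searching only a small window."""
        guess = self._predict(key)
        lo = max(0, guess - self.err)
        hi = min(len(self.keys), guess + self.err + 1)
        pos = bisect_left(self.keys, key, lo, hi)
        return pos if pos < hi and self.keys[pos] == key else None

idx = LinearLearnedIndex(sorted_keys=list(range(0, 1000, 7)))
print(idx.lookup(49), idx.lookup(50))   # -> 7 None
```

DobLIX's dual-objective training and its integration with RocksDB go beyond this sketch.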
- Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learning [59.001091197106085]
Multi-Task Learning (MTL) for Vision Transformer aims to enhance model capability by tackling multiple tasks simultaneously.
Most recent works have predominantly focused on designing Mixture-of-Experts (MoE) structures and integrating Low-Rank Adaptation (LoRA) to efficiently perform multi-task learning.
We propose a novel approach dubbed Efficient Multi-Task Learning (EMTAL) by transforming a pre-trained Vision Transformer into an efficient multi-task learner.
arXiv Detail & Related papers (2025-01-12T17:41:23Z)
- SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models [88.29990536278167]
We introduce SPaR, a self-play framework integrating tree-search self-refinement to yield valid and comparable preference pairs free from distractions.
Our experiments show that a LLaMA3-8B model, trained over three iterations guided by SPaR, surpasses GPT-4-Turbo on the IFEval benchmark without losing general capabilities.
arXiv Detail & Related papers (2024-12-16T09:47:43Z)
- CAMAL: Optimizing LSM-trees via Active Learning [21.859547644109085]
We use machine learning to optimize LSM-tree structure, aiming to reduce the cost of processing various read/write operations.
By integrating Camal into a full RocksDB system, performance improves by 28% on average and up to 8x compared to a state-of-the-art RocksDB design.
arXiv Detail & Related papers (2024-09-23T15:35:23Z)
- LearnedKV: Integrating LSM and Learned Index for Superior Performance on SSD [0.6774462529828165]
We introduce LearnedKV, a novel tiered key-value store that seamlessly integrates a Log-Structured Merge (LSM) tree with a Learned Index.
Our results show that LearnedKV outperforms state-of-the-art solutions by up to 1.32x on read requests and 1.31x on writes.
arXiv Detail & Related papers (2024-06-27T05:08:09Z)
- Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark [166.40879020706151]
This paper proposes a shift towards BP-free, zeroth-order (ZO) optimization as a solution for reducing memory costs during fine-tuning.
Unlike traditional ZO-SGD methods, our work expands the exploration to a wider array of ZO optimization techniques.
Our study unveils previously overlooked optimization principles, highlighting the importance of task alignment, the role of the forward gradient method, and the balance between algorithm complexity and fine-tuning performance.
arXiv Detail & Related papers (2024-02-18T14:08:48Z)
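For context on the zeroth-order optimization entry above: ZO methods replace backpropagation with gradient estimates built from loss-only (forward) evaluations. Below is a minimal, generic two-point ZO-SGD sketch, not the benchmark's implementation; the toy quadratic loss and all constants are placeholders.

```python
# Generic two-point zeroth-order (ZO) gradient estimator with SGD, for
# illustration only (not the benchmark's implementation). Only forward loss
# evaluations are used, so no backpropagation state needs to be stored.
import random

def loss(theta):
    """Toy objective standing in for a fine-tuning loss; minimum at (1, -2, 3)."""
    target = [1.0, -2.0, 3.0]
    return sum((t - g) ** 2 for t, g in zip(theta, target))

def zo_gradient(theta, mu=1e-3):
    """Estimate grad L(theta) as ((L(theta + mu*u) - L(theta - mu*u)) / (2*mu)) * u."""
    u = [random.gauss(0.0, 1.0) for _ in theta]
    plus = [t + mu * ui for t, ui in zip(theta, u)]
    minus = [t - mu * ui for t, ui in zip(theta, u)]
    scale = (loss(plus) - loss(minus)) / (2.0 * mu)
    return [scale * ui for ui in u]

theta, lr = [0.0, 0.0, 0.0], 0.02
for _ in range(3000):
    grad = zo_gradient(theta)
    theta = [t - lr * g for t, g in zip(theta, grad)]

print([round(t, 2) for t in theta], round(loss(theta), 4))
```

The benchmark paper compares a wider family of such estimators and their pairings with different fine-tuning schemes.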
- Continual Learning with Dynamic Sparse Training: Exploring Algorithms for Effective Model Updates [13.983410740333788]
Continual learning (CL) refers to the ability of an intelligent system to sequentially acquire and retain knowledge from a stream of data with as little computational overhead as possible.
Dynamic Sparse Training (DST) is a prominent way to find these sparse networks and isolate them for each task.
This paper is the first empirical study investigating the effect of different DST components under the CL paradigm.
arXiv Detail & Related papers (2023-08-28T18:31:09Z)
- Partitioning Distributed Compute Jobs with Reinforcement Learning and Graph Neural Networks [58.720142291102135]
Large-scale machine learning models are bringing advances to a broad range of fields.
Many of these models are too large to be trained on a single machine, and must be distributed across multiple devices.
We show that maximum parallelisation is sub-optimal in relation to user-critical metrics such as throughput and blocking rate.
arXiv Detail & Related papers (2023-01-31T17:41:07Z)
- Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning [114.36124979578896]
We design a dynamic mechanism using offline reinforcement learning algorithms.
Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
arXiv Detail & Related papers (2022-05-05T05:44:26Z)
- AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z)