High-Dimensional Bayesian Optimization with Multi-Task Learning for
RocksDB
- URL: http://arxiv.org/abs/2103.16267v2
- Date: Wed, 31 Mar 2021 20:53:12 GMT
- Title: High-Dimensional Bayesian Optimization with Multi-Task Learning for
RocksDB
- Authors: Sami Alabed, Eiko Yoneki
- Abstract summary: RocksDB is a general-purpose embedded key-value store.
This paper investigates maximizing the throughput of RocksDB IO operations by auto-tuning ten parameters.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: RocksDB is a general-purpose embedded key-value store used in multiple
different settings. Its versatility comes at the cost of complex tuning
configurations. This paper investigates maximizing the throughput of RocksDB IO
operations by auto-tuning ten parameters of varying ranges. Off-the-shelf
optimizers struggle with high-dimensional problem spaces and require a large
number of training samples. We propose two techniques to tackle this problem:
multi-task modeling and dimensionality reduction through a manual grouping of
parameters. By incorporating adjacent optimization in the model, it converged
faster and found complicated settings that other tuners could not find. This
approach added computational overhead, which
we mitigated by manually assigning parameters to each sub-goal through our
knowledge of RocksDB. The model is then incorporated in a standard Bayesian
Optimization loop to find parameters that maximize RocksDB's IO throughput. Our
method achieved a 1.3x improvement when benchmarked against a simulation of
Facebook's social graph traffic, and converged in ten optimization steps
compared to other state-of-the-art methods that required fifty steps.
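To make the tuning loop above concrete, the sketch below runs Bayesian optimization over a manually grouped parameter space. It assumes a simplified surrogate (one Gaussian process per parameter group, averaged), random candidate generation, an upper-confidence-bound acquisition, and a placeholder objective standing in for a RocksDB benchmark; the parameter names and grouping are hypothetical, not the authors' exact multi-task model.

```python
# Sketch only: grouped-parameter Bayesian optimization with a placeholder
# objective. Parameter names, grouping, and the surrogate combination are
# illustrative assumptions, not the paper's exact multi-task model.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

# Ten hypothetical parameters, each normalized to [0, 1]. A real tuner would
# map these to RocksDB options (e.g. write buffer size, background jobs).
NAMES = [f"param_{i}" for i in range(10)]

# Manual grouping of parameters into sub-goals, standing in for the paper's
# knowledge-driven assignment of RocksDB options to sub-goals.
GROUPS = [NAMES[0:4], NAMES[4:7], NAMES[7:10]]

def benchmark(x):
    """Placeholder for measured IO throughput (higher is better)."""
    return -float(np.sum((x - 0.3) ** 2)) + 0.01 * rng.normal()

def fit_group_models(X, y):
    """Fit one GP per parameter group, each seeing only its own dimensions."""
    models = []
    for group in GROUPS:
        idx = [NAMES.index(n) for n in group]
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
        gp.fit(X[:, idx], y)
        models.append((idx, gp))
    return models

def predict(models, X):
    """Average the group surrogates (a crude stand-in for multi-task modeling)."""
    mean, var = np.zeros(len(X)), np.zeros(len(X))
    for idx, gp in models:
        m, s = gp.predict(X[:, idx], return_std=True)
        mean += m / len(models)
        var += (s / len(models)) ** 2
    return mean, np.sqrt(var)

# Standard BO loop: fit surrogates, score random candidates, measure the best.
X = rng.uniform(size=(5, len(NAMES)))              # initial random configs
y = np.array([benchmark(x) for x in X])
for step in range(10):
    models = fit_group_models(X, y)
    cand = rng.uniform(size=(256, len(NAMES)))
    mean, std = predict(models, cand)
    pick = cand[np.argmax(mean + 2.0 * std)]       # upper confidence bound
    X = np.vstack([X, pick])
    y = np.append(y, benchmark(pick))
print("best observed throughput proxy:", y.max())
```

In the paper, it is the multi-task surrogate and the manual grouping that allow convergence within roughly ten optimization steps; the sketch above only mirrors the overall shape of the loop.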
Related papers
- Tune As You Scale: Hyperparameter Optimization For Compute Efficient
Training [0.0]
We propose a practical method for robustly tuning large models.
CARBS performs local search around the performance-cost frontier.
Among our results, we effectively solve the entire ProcGen benchmark just by tuning a simple baseline.
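As a rough illustration of local search around a performance-cost frontier (the general idea only, not the CARBS algorithm), one can keep the Pareto-optimal observations and propose new candidates by perturbing them; the toy objective and cost functions below are placeholders.

```python
# Rough sketch of local search around a performance-cost Pareto frontier.
# Generic illustration only; the objective and cost functions are toys.
import numpy as np

rng = np.random.default_rng(1)

def evaluate(x):
    """Placeholder: returns (performance, cost) for a hyperparameter vector."""
    perf = -float(np.sum((x - 0.6) ** 2))   # higher is better
    cost = float(np.sum(np.abs(x)))         # e.g. compute spent
    return perf, cost

def pareto_frontier(points):
    """Indices of points not dominated in (higher perf, lower cost)."""
    keep = []
    for i, (p_i, c_i) in enumerate(points):
        dominated = any(p_j >= p_i and c_j <= c_i and (p_j > p_i or c_j < c_i)
                        for j, (p_j, c_j) in enumerate(points) if j != i)
        if not dominated:
            keep.append(i)
    return keep

# Start from a few random configurations, then repeatedly perturb frontier members.
configs = [rng.uniform(size=4) for _ in range(4)]
scores = [evaluate(x) for x in configs]
for _ in range(30):
    frontier = pareto_frontier(scores)
    parent = configs[rng.choice(frontier)]
    child = np.clip(parent + rng.normal(scale=0.1, size=parent.shape), 0, 1)
    configs.append(child)
    scores.append(evaluate(child))

best = max(range(len(scores)), key=lambda i: scores[i][0])
print("best performance:", scores[best][0], "at cost", scores[best][1])
```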
arXiv Detail & Related papers (2023-06-13T18:22:24Z)
- Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose a graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges.
Considering the redundancy in existing architectures, we first utilize the mode approximation to generate 0.1M trainable parameters to implement the multimodal prompt tuning.
A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
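For intuition about the mode-approximation idea, the sketch below builds a large fusion tensor from a few small factor matrices, so that only the factors would need training. It is a generic CP-style low-rank illustration with made-up shapes, not Aurora's actual architecture or prompt design.

```python
# Loose illustration of generating a large weight tensor from a small number
# of trainable parameters via a CP-style low-rank "mode approximation".
# Generic sketch, not Aurora's architecture; shapes are made up.
import numpy as np

rng = np.random.default_rng(2)
rank, d_text, d_image, d_out = 8, 64, 64, 32

# Only these factor matrices would be trained ((64 + 64 + 32) * 8 = 1280
# numbers) instead of the full 64 * 64 * 32 = 131072-entry tensor they span.
A = rng.normal(scale=0.02, size=(d_text, rank))
B = rng.normal(scale=0.02, size=(d_image, rank))
C = rng.normal(scale=0.02, size=(d_out, rank))

def reconstruct_weight():
    """Sum of rank-1 terms: W[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def fuse(text_feat, image_feat):
    """Bilinear-style fusion using the reconstructed tensor."""
    W = reconstruct_weight()
    return np.einsum("i,j,ijk->k", text_feat, image_feat, W)

out = fuse(rng.normal(size=d_text), rng.normal(size=d_image))
print(out.shape)  # (32,)
```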
arXiv Detail & Related papers (2023-05-15T06:40:56Z)
- Agent-based Collaborative Random Search for Hyper-parameter Tuning and Global Function Optimization [0.0]
This paper proposes an agent-based collaborative technique for finding near-optimal values for any arbitrary set of hyper-parameters in a machine learning model.
The behavior of the presented model, specifically against the changes in its design parameters, is investigated in both machine learning and global function optimization applications.
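A minimal sketch of the collaborative idea follows, under the assumption that agents simply share their best configuration and re-center their local random search around it; the paper's actual coordination protocol may differ, and the loss function is a toy.

```python
# Sketch: several "agents" sample hyperparameters independently and
# periodically share their best finding, which biases the next round.
# Generic illustration, not the paper's exact protocol.
import numpy as np

rng = np.random.default_rng(3)

def loss(hp):
    """Placeholder validation loss for a 3-dim hyperparameter vector."""
    return float(np.sum((hp - np.array([0.2, 0.5, 0.8])) ** 2))

n_agents, rounds, samples_per_round = 4, 10, 8
centers = [rng.uniform(size=3) for _ in range(n_agents)]   # each agent's focus
global_best, global_best_loss = None, np.inf

for _ in range(rounds):
    for a in range(n_agents):
        # Each agent does local random search around its current center.
        cand = np.clip(centers[a] + rng.normal(scale=0.2,
                       size=(samples_per_round, 3)), 0, 1)
        losses = [loss(c) for c in cand]
        i = int(np.argmin(losses))
        if losses[i] < global_best_loss:
            global_best, global_best_loss = cand[i], losses[i]
    # Collaboration step: agents re-center partway toward the shared best.
    centers = [0.5 * c + 0.5 * global_best for c in centers]

print("best hyperparameters:", global_best, "loss:", global_best_loss)
```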
arXiv Detail & Related papers (2023-03-03T21:10:17Z)
- VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatile optimizers.
We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates.
We open source our learned optimizers, meta-training code, the associated train and test data, and an extensive benchmark suite with baselines at velo-code.io.
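The sketch below only illustrates the interface of such a learned optimizer: a tiny network maps per-parameter features (gradient and momentum here) to updates. The network weights are random placeholders rather than meta-trained, so it will not actually optimize well; this is not VeLO's architecture.

```python
# Toy illustration of a learned-optimizer interface: a small network ingests
# per-parameter features and emits parameter updates. Weights are random
# placeholders; VeLO's actual network and meta-training are far more involved.
import numpy as np

rng = np.random.default_rng(4)

# A tiny fixed MLP standing in for a meta-trained update rule.
W1, b1 = rng.normal(scale=0.1, size=(2, 16)), np.zeros(16)
W2, b2 = rng.normal(scale=0.1, size=(16, 1)), np.zeros(1)

def learned_update(grad, momentum):
    """Map per-parameter features -> per-parameter update (vectorized)."""
    feats = np.stack([grad, momentum], axis=-1)        # (..., 2)
    hidden = np.tanh(feats @ W1 + b1)                  # (..., 16)
    return 0.01 * (hidden @ W2 + b2)[..., 0]           # small step sizes

# Apply a few steps to a toy quadratic; with un-meta-trained weights the
# updates are arbitrary, so this only demonstrates the calling convention.
theta = rng.normal(size=5)
momentum = np.zeros_like(theta)
for _ in range(5):
    grad = 2 * theta                                   # gradient of sum(theta^2)
    momentum = 0.9 * momentum + grad
    theta = theta - learned_update(grad, momentum)
print("parameters after 5 learned-optimizer steps:", theta)
```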
arXiv Detail & Related papers (2022-11-17T18:39:07Z)
- Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
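A minimal sketch of the pre-training idea, assuming the simplest possible setup: kernel hyperparameters are fitted once on pooled evaluations of similar toy functions and then reused, fixed, inside a Bayesian optimization loop on the new function. The paper's actual procedure for building the prior is more sophisticated.

```python
# Sketch only: reuse GP kernel hyperparameters fitted on similar functions as a
# fixed prior for Bayesian optimization of a new function. Toy 1-d objectives.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(5)

def related_task(x, shift):
    """Family of similar objectives, differing only by a horizontal shift."""
    return np.sin(3 * (x - shift)) + 0.1 * rng.normal(size=np.shape(x))

# "Pre-train": learn length-scale and variance from pooled data of seen tasks.
X_pre, y_pre = [], []
for shift in (0.0, 0.3, 0.6):
    Xs = rng.uniform(0, 2 * np.pi, size=(20, 1))
    X_pre.append(Xs)
    y_pre.append(related_task(Xs[:, 0], shift))
X_pre, y_pre = np.vstack(X_pre), np.concatenate(y_pre)
pre_gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
pre_gp.fit(X_pre, y_pre)
pretrained_kernel = pre_gp.kernel_          # kernel with fitted hyperparameters

# BO loop on the target task, keeping the pre-trained kernel fixed.
def target(x):
    return np.sin(3 * (x - 0.45))

X = rng.uniform(0, 2 * np.pi, size=(3, 1))
y = target(X[:, 0])
for _ in range(10):
    gp = GaussianProcessRegressor(kernel=pretrained_kernel, optimizer=None,
                                  normalize_y=True).fit(X, y)
    cand = rng.uniform(0, 2 * np.pi, size=(200, 1))
    mean, std = gp.predict(cand, return_std=True)
    X = np.vstack([X, cand[[int(np.argmax(mean + std))]]])
    y = np.append(y, target(X[-1, 0]))
print("best value found on the new task:", y.max())
```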
arXiv Detail & Related papers (2022-07-07T04:42:54Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3x-30x.
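The sketch below shows the core idea in a heavily simplified form: greedily pick a small subset whose mean per-example gradient approximates the full-data gradient, so that cheaper inner evaluations can run on the subset. The greedy matching and the logistic-regression toy problem are illustrative assumptions, not the paper's framework.

```python
# Sketch of gradient-based data subset selection: choose a small subset whose
# mean gradient tracks the full-data gradient. Simplified, toy problem only.
import numpy as np

rng = np.random.default_rng(6)
n, d = 500, 10
X = rng.normal(size=(n, d))
y = (X @ rng.normal(size=d) + 0.5 * rng.normal(size=n) > 0).astype(float)
w = np.zeros(d)                                   # current model parameters

def per_example_grads(w):
    """Logistic-loss gradient for every example, shape (n, d)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return (p - y)[:, None] * X

G = per_example_grads(w)
full_grad = G.mean(axis=0)

# Greedy selection: repeatedly add the example whose gradient best reduces the
# residual between the subset's mean gradient and the full mean gradient.
subset = []
selected = np.zeros(n, dtype=bool)
residual = full_grad.copy()
for _ in range(25):                               # keep 5% of the data
    scores = G @ residual                         # alignment with what is missing
    scores[selected] = -np.inf                    # do not pick twice
    i = int(np.argmax(scores))
    subset.append(i)
    selected[i] = True
    residual = full_grad - G[subset].mean(axis=0)

approx = G[subset].mean(axis=0)
print("relative gradient approximation error:",
      float(np.linalg.norm(full_grad - approx) / np.linalg.norm(full_grad)))
```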
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Hash Layers For Large Sparse Models [48.90784451703753]
We modify the feedforward layer to hash to different sets of weights depending on the current token, over all tokens in the sequence.
We show that this procedure either outperforms or is competitive with learning-to-route mixture-of-expert methods.
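A minimal sketch of a hash-routed feedforward layer in that spirit: each token id is deterministically hashed to one of a few expert weight sets, so no routing network is learned. The dimensions and the modulo "hash" are illustrative stand-ins, not the paper's configuration.

```python
# Minimal sketch of a hash-routed feedforward layer: each token id selects one
# of K expert weight sets via a fixed hash, so routing needs no learned gating.
import numpy as np

rng = np.random.default_rng(7)
vocab, d_model, d_ff, n_experts = 1000, 32, 64, 4

# One feedforward expert per hash bucket.
W_in = rng.normal(scale=0.02, size=(n_experts, d_model, d_ff))
W_out = rng.normal(scale=0.02, size=(n_experts, d_ff, d_model))

def hash_route(token_ids):
    """Fixed (non-learned) assignment of tokens to experts."""
    return token_ids % n_experts          # stand-in for a real hash function

def hash_ffn(token_ids, hidden):
    """hidden: (seq_len, d_model) -> (seq_len, d_model), expert chosen per token."""
    experts = hash_route(token_ids)
    out = np.empty_like(hidden)
    for t, e in enumerate(experts):       # per-token expert lookup
        h = np.maximum(hidden[t] @ W_in[e], 0.0)      # ReLU
        out[t] = h @ W_out[e]
    return out

tokens = rng.integers(0, vocab, size=16)
states = rng.normal(size=(16, d_model))
print(hash_ffn(tokens, states).shape)     # (16, 32)
```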
arXiv Detail & Related papers (2021-06-08T14:54:24Z)
- Surrogate Model Based Hyperparameter Tuning for Deep Learning with SPOT [0.40611352512781856]
This article demonstrates how the architecture-level parameters of deep learning models implemented in Keras/TensorFlow can be optimized.
The implementation of the tuning procedure is 100% based on R, the software environment for statistical computing.
arXiv Detail & Related papers (2021-05-30T21:16:51Z)
- Scaling Distributed Deep Learning Workloads beyond the Memory Capacity with KARMA [58.040931661693925]
We propose a strategy that combines redundant recomputing and out-of-core methods.
We achieve an average of 1.52x speedup in six different models over the state-of-the-art out-of-core methods.
Our data parallel out-of-core solution can outperform complex hybrid model parallelism in training large models, e.g. Megatron-LM and Turing-NLG.
arXiv Detail & Related papers (2020-08-26T07:24:34Z)
- Weighting Is Worth the Wait: Bayesian Optimization with Importance Sampling [34.67740033646052]
By learning a parameterization of IS that trades off evaluation complexity and quality, we improve upon Bayesian optimization state-of-the-art runtime and final validation error across a variety of datasets and complex neural architectures.
arXiv Detail & Related papers (2020-02-23T15:52:08Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.