A Simple and Fast Baseline for Tuning Large XGBoost Models
- URL: http://arxiv.org/abs/2111.06924v1
- Date: Fri, 12 Nov 2021 20:17:50 GMT
- Title: A Simple and Fast Baseline for Tuning Large XGBoost Models
- Authors: Sanyam Kapoor, Valerio Perrone
- Abstract summary: We show that uniform subsampling makes for a simple yet fast baseline to speed up the tuning of large XGBoost models.
We demonstrate the effectiveness of this baseline on large-scale datasets ranging from $15-70\mathrm{GB}$ in size.
- Score: 8.203493207581937
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: XGBoost, a scalable tree boosting algorithm, has proven effective for many
prediction tasks of practical interest, especially using tabular datasets.
Hyperparameter tuning can further improve the predictive performance, but
unlike neural networks, full-batch training of many models on large datasets
can be time consuming. Owing to the discovery that (i) there is a strong linear
relation between dataset size & training time, (ii) XGBoost models satisfy the
ranking hypothesis, and (iii) lower-fidelity models can discover promising
hyperparameter configurations, we show that uniform subsampling makes for a
simple yet fast baseline to speed up the tuning of large XGBoost models using
multi-fidelity hyperparameter optimization with data subsets as the fidelity
dimension. We demonstrate the effectiveness of this baseline on large-scale
tabular datasets ranging from $15-70\mathrm{GB}$ in size.
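The recipe in the abstract can be sketched in a few lines: treat the fraction of uniformly subsampled training rows as the fidelity and let a multi-fidelity optimizer discard weak configurations on cheap subsets before training the survivors on the full data. The sketch below uses successive halving as a stand-in optimizer on a synthetic dataset; the search space, subsample fractions, and elimination rate are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch: uniform data subsampling as the fidelity dimension for
# multi-fidelity XGBoost tuning (successive halving used as a stand-in
# optimizer; dataset and search space are illustrative assumptions).
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=50_000, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

def sample_config():
    # Illustrative XGBoost search space.
    return {
        "max_depth": int(rng.integers(3, 11)),
        "learning_rate": float(10 ** rng.uniform(-2, -0.5)),
        "subsample": float(rng.uniform(0.5, 1.0)),
        "colsample_bytree": float(rng.uniform(0.5, 1.0)),
    }

def evaluate(params, fraction):
    # Fidelity = fraction of uniformly subsampled training rows.
    idx = rng.choice(len(X_tr), size=int(fraction * len(X_tr)), replace=False)
    model = xgb.XGBClassifier(n_estimators=200, tree_method="hist", **params)
    model.fit(X_tr[idx], y_tr[idx])
    return roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])

# Successive halving over data fractions: cheap low-fidelity rounds discard
# weak configurations; only the survivors ever see the full dataset.
configs = [sample_config() for _ in range(16)]
for fraction in (0.1, 0.3, 1.0):
    scores = [evaluate(c, fraction) for c in configs]
    keep = max(1, len(configs) // 3)
    configs = [c for _, c in sorted(zip(scores, configs),
                                    key=lambda t: -t[0])[:keep]]

print("best configuration found:", configs[0])
```

Because low-fidelity scores are assumed to preserve the ranking of configurations (the paper's ranking hypothesis), the full-data rounds only pay for the handful of survivors.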
Related papers
- Scaling Up Diffusion and Flow-based XGBoost Models [5.944645679491607]
We investigate a recent proposal to use XGBoost as the function approximator in diffusion and flow-matching models.
With better implementation it can be scaled to datasets 370x larger than previously used.
We present results on large-scale scientific datasets as part of the Fast Calorimeter Simulation Challenge.
arXiv Detail & Related papers (2024-08-28T18:00:00Z)
- Under the Hood of Tabular Data Generation Models: the Strong Impact of Hyperparameter Tuning [2.5168710814072894]
This study addresses the practical need for a unified evaluation of models.
We propose a reduced search space for each model that allows for quick optimization.
For most models, large-scale dataset-specific tuning substantially improves performance compared to the original configurations.
arXiv Detail & Related papers (2024-06-18T07:27:38Z)
- Functional Graphical Models: Structure Enables Offline Data-Driven Optimization [111.28605744661638]
We show how structure can enable sample-efficient data-driven optimization.
We also present a data-driven optimization algorithm that infers the FGM structure itself.
arXiv Detail & Related papers (2024-01-08T22:33:14Z)
- Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How [62.467716468917224]
We propose a methodology that jointly searches for the optimal pretrained model and the hyperparameters for finetuning it.
Our method transfers knowledge about the performance of many pretrained models on a series of datasets.
We empirically demonstrate that our resulting approach can quickly select an accurate pretrained model for a new dataset.
arXiv Detail & Related papers (2023-06-06T16:15:26Z)
- Deep incremental learning models for financial temporal tabular datasets with distribution shifts [0.9790236766474201]
The framework uses a simple basic building block (decision trees) to build self-similar models of any required complexity.
We demonstrate our scheme using XGBoost models trained on the Numerai dataset and show that a two-layer deep ensemble of XGBoost models over different model snapshots delivers high-quality predictions (a generic stacking sketch appears after this list).
arXiv Detail & Related papers (2023-03-14T14:10:37Z)
- Efficient Graph Neural Network Inference at Large Scale [54.89457550773165]
Graph neural networks (GNNs) have demonstrated excellent performance in a wide range of applications.
Existing scalable GNNs leverage linear propagation to preprocess the features and accelerate the training and inference procedure.
We propose a novel adaptive propagation order approach that generates the personalized propagation order for each node based on its topological information.
arXiv Detail & Related papers (2022-11-01T14:38:18Z)
- Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
arXiv Detail & Related papers (2022-07-07T04:42:54Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyperparameter tuning.
We show that using gradient-based data subsets for hyperparameter tuning achieves significantly faster turnaround times and speedups of 3$\times$-30$\times$.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation [97.42894942391575]
We propose FAST-DAD to distill arbitrarily complex ensemble predictors into individual models like boosted trees, random forests, and deep networks.
Our individual distilled models are over 10x faster and more accurate than ensemble predictors produced by AutoML tools like H2O/AutoSklearn.
arXiv Detail & Related papers (2020-06-25T09:57:47Z)
- Collegial Ensembles [11.64359837358763]
We show that collegial ensembles can be efficiently implemented in practical architectures using group convolutions and block diagonal layers.
We also show how our framework can be used to analytically derive optimal group convolution modules without having to train a single model.
arXiv Detail & Related papers (2020-06-13T16:40:26Z)
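For the snapshot-ensemble entry above, the following is a rough two-layer stacking sketch rather than that paper's implementation: first-layer features are predictions from a single XGBoost model truncated at several boosting rounds (a simple way to obtain snapshots), and a second, small XGBoost model is fit on them. The dataset, round counts, and the in-sample stacking shortcut are illustrative assumptions; a real stack would use held-out or out-of-fold first-layer predictions.

```python
# Rough sketch of a two-layer ensemble over XGBoost "snapshots"
# (illustrative assumptions throughout; not the cited paper's code).
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=10_000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Layer 1: one boosted model; snapshots are its predictions truncated at
# increasing numbers of boosting rounds.
base = xgb.XGBRegressor(n_estimators=300, tree_method="hist")
base.fit(X_tr, y_tr)
snapshot_rounds = [(0, 100), (0, 200), (0, 300)]

def snapshot_features(data):
    return np.column_stack(
        [base.predict(data, iteration_range=r) for r in snapshot_rounds]
    )

# Layer 2: a small XGBoost model stacked on the snapshot predictions.
# (In-sample stacking is a shortcut; use out-of-fold predictions in practice.)
meta = xgb.XGBRegressor(n_estimators=50, max_depth=2)
meta.fit(snapshot_features(X_tr), y_tr)
print("first five test predictions:", meta.predict(snapshot_features(X_te))[:5])
```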
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.