Light Gradient Boosting Machine as a Regression Method for Quantitative
Structure-Activity Relationships
- URL: http://arxiv.org/abs/2105.08626v1
- Date: Wed, 28 Apr 2021 20:19:44 GMT
- Title: Light Gradient Boosting Machine as a Regression Method for Quantitative
Structure-Activity Relationships
- Authors: Robert P. Sheridan, Andy Liaw, Matthew Tudor
- Abstract summary: We compare Light Gradient Boosting Machine (LightGBM) to random forest, single-task deep neural nets, and Extreme Gradient Boosting (XGBoost) on 30 in-house data sets.
LightGBM makes predictions about as accurate as single-task deep neural nets, but is roughly 1000-fold faster than random forest and about 4-fold faster than XGBoost.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the pharmaceutical industry, where it is common to generate many QSAR
models with large numbers of molecules and descriptors, the best QSAR methods
are those that can generate the most accurate predictions but that are also
insensitive to hyperparameters and are computationally efficient. Here we
compare Light Gradient Boosting Machine (LightGBM) to random forest,
single-task deep neural nets, and Extreme Gradient Boosting (XGBoost) on 30
in-house data sets. While any boosting algorithm has many adjustable
hyperparameters, we can define a set of standard hyperparameters at which
LightGBM makes predictions about as accurate as single-task deep neural nets,
but is roughly 1000-fold faster than random forest and ~4-fold faster than
XGBoost in terms of total computational time for the largest models. Another
very useful feature of LightGBM is that it includes a native method for
estimating prediction intervals.
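The two properties highlighted above, a fixed "standard" hyperparameter set and native prediction intervals via the quantile objective, can be sketched in a few lines of Python. This is a minimal illustration on synthetic data; the hyperparameter values and descriptor matrix below are assumptions for the sketch, not the paper's published settings.
```python
# Hedged sketch: LightGBM regression with one fixed ("standard") hyperparameter
# set, plus prediction intervals via LightGBM's native quantile objective.
# The specific values below are illustrative assumptions, not the paper's settings.
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 50))                    # stand-in for molecular descriptors
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=2000)  # stand-in activity
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

params = dict(n_estimators=500, num_leaves=31, learning_rate=0.05,
              min_child_samples=20)                # assumed "standard" settings

point = lgb.LGBMRegressor(objective="regression", **params).fit(X_tr, y_tr)
lo = lgb.LGBMRegressor(objective="quantile", alpha=0.1, **params).fit(X_tr, y_tr)
hi = lgb.LGBMRegressor(objective="quantile", alpha=0.9, **params).fit(X_tr, y_tr)

y_hat = point.predict(X_te)                        # point predictions
interval = np.stack([lo.predict(X_te), hi.predict(X_te)], axis=1)  # ~80% interval
```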
Related papers
- From Point to probabilistic gradient boosting for claim frequency and severity prediction [1.3812010983144802]
We present a unified notation for, and contrast, all existing point and probabilistic gradient boosting algorithms for decision trees.
We compare their performance on five publicly available datasets for claim frequency and severity, of various sizes and comprising different numbers of (high-cardinality) categorical variables.
arXiv Detail & Related papers (2024-12-19T14:50:10Z)
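For the claim frequency/severity entry above, here is a minimal sketch of point (non-probabilistic) gradient boosting with distribution-aware objectives, using LightGBM's built-in Poisson and Gamma objectives on synthetic data. The paper's own models and datasets are not reproduced here.
```python
# Hedged sketch: frequency/severity modeling with distribution-aware boosting
# objectives. LightGBM's "poisson" (counts) and "gamma" (positive severities)
# objectives stand in for the paper's broader point-vs-probabilistic comparison.
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 8))                           # synthetic policy features
freq = rng.poisson(lam=np.exp(0.3 * X[:, 0]))            # claim counts
sev = rng.gamma(shape=2.0, scale=np.exp(0.2 * X[:, 1]))  # positive claim sizes

freq_model = lgb.LGBMRegressor(objective="poisson").fit(X, freq)
sev_model = lgb.LGBMRegressor(objective="gamma").fit(X, sev + 1e-6)  # keep labels > 0

expected_cost = freq_model.predict(X) * sev_model.predict(X)  # frequency x severity
```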
- Scaling Sparse Fine-Tuning to Large Language Models [67.59697720719672]
Large Language Models (LLMs) are difficult to fully fine-tune due to their sheer number of parameters.
We propose SpIEL, a novel sparse fine-tuning method that maintains an array of parameter indices and the deltas of these parameters relative to their pretrained values.
We show that SpIEL is superior to popular parameter-efficient fine-tuning methods like LoRA in terms of performance and comparable in terms of run time.
arXiv Detail & Related papers (2024-01-29T18:43:49Z)
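For the SpIEL entry above, a toy reconstruction of the stated idea: fine-tuning state stored as an array of parameter indices plus deltas relative to the pretrained values. The index and delta values are hypothetical, and this is not the authors' implementation.
```python
# Hedged sketch reconstructed from the abstract: sparse fine-tuning state kept
# as (indices, deltas) rather than a dense copy of the updated weights.
import numpy as np

pretrained = np.random.default_rng(2).normal(size=1_000_000)  # flat param vector

# Hypothetical sparse fine-tuning state: which entries moved, and by how much.
indices = np.array([3, 17, 42_000, 999_999])
deltas = np.array([0.01, -0.02, 0.005, 0.03])

def apply_sparse_deltas(params, idx, d):
    """Materialize fine-tuned weights from a sparse set of per-index offsets."""
    out = params.copy()
    out[idx] += d
    return out

finetuned = apply_sparse_deltas(pretrained, indices, deltas)
```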
- Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model [89.8764435351222]
We propose a new family of unbiased estimators, called WTA-CRS, for matrix multiplication with reduced variance.
Our work provides both theoretical and experimental evidence that, in the context of tuning transformers, our proposed estimators exhibit lower variance compared to existing ones.
arXiv Detail & Related papers (2023-05-24T15:52:08Z)
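For the WTA-CRS entry above, a sketch of the classical column-row (CR) sampling estimator of a matrix product, the unbiased family the abstract refers to. WTA-CRS is a lower-variance member of this family; its exact sampling rule is not reproduced here.
```python
# Hedged sketch: unbiased column-row sampling estimate of A @ B.
import numpy as np

def cr_sample_matmul(A, B, k, rng):
    """Unbiased estimate of A @ B from k sampled column-row outer products."""
    n = A.shape[1]
    # Importance weights proportional to |A[:, i]| * |B[i, :]| (standard choice).
    p = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = p / p.sum()
    idx = rng.choice(n, size=k, p=p)
    # Scale each sampled outer product by 1 / (k * p_i) so the mean equals A @ B.
    return sum(np.outer(A[:, i], B[i, :]) / (k * p[i]) for i in idx)

rng = np.random.default_rng(3)
A, B = rng.normal(size=(64, 256)), rng.normal(size=(256, 32))
approx = cr_sample_matmul(A, B, k=64, rng=rng)   # E[approx] == A @ B
```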
- Quantile Extreme Gradient Boosting for Uncertainty Quantification [1.7685947618629572]
Extreme Gradient Boosting (XGBoost) is one of the most popular machine learning (ML) methods.
We propose an enhancement to XGBoost, QXGBoost, in which a modified quantile regression is used as the objective function to estimate uncertainty.
Our proposed method performed comparably to or better than the uncertainty estimates generated by regular and quantile light gradient boosting.
arXiv Detail & Related papers (2023-04-23T19:46:19Z)
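For the QXGBoost entry above, a sketch of how a quantile-style objective can be plugged into XGBoost as a custom objective. The paper's modified quantile loss is not reproduced; this uses the plain pinball gradient with a constant surrogate Hessian, a common workaround for the pinball loss's zero second derivative.
```python
# Hedged sketch: custom pinball (quantile) objective for xgboost.train.
import numpy as np
import xgboost as xgb

def pinball_objective(tau):
    def obj(preds, dtrain):
        u = dtrain.get_label() - preds
        grad = np.where(u > 0, -tau, 1.0 - tau)   # d(pinball)/d(pred)
        hess = np.full_like(preds, 1.0)           # constant surrogate curvature
        return grad, hess
    return obj

rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 5))
y = X[:, 0] + rng.normal(scale=0.3, size=1000)
dtrain = xgb.DMatrix(X, label=y)

upper = xgb.train({"max_depth": 4, "eta": 0.1}, dtrain,
                  num_boost_round=200, obj=pinball_objective(0.9))
q90 = upper.predict(dtrain)    # ~90th-percentile predictions
```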
- Scalable Gaussian Process Hyperparameter Optimization via Coverage Regularization [0.0]
We present a novel algorithm that estimates the smoothness and length-scale parameters of the Matérn kernel in order to improve the robustness of the resulting prediction uncertainties.
We achieve improved uncertainty quantification (UQ) over leave-one-out likelihood while maintaining a high degree of scalability, as demonstrated in numerical experiments.
arXiv Detail & Related papers (2022-09-22T19:23:37Z)
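For the Matérn-kernel entry above, a short sketch of the Matérn covariance function whose smoothness (nu) and length-scale (ell) parameters the cited algorithm estimates; the coverage-regularized estimation itself is omitted.
```python
# Hedged sketch: the Matern covariance, shown only to fix notation.
import numpy as np
from scipy.special import gamma, kv

def matern(r, nu=1.5, ell=1.0, sigma2=1.0):
    """Matern covariance k(r) for distances r >= 0."""
    r = np.asarray(r, dtype=float)
    scaled = np.sqrt(2 * nu) * np.maximum(r, 1e-12) / ell   # avoid kv(., 0)
    return sigma2 * (2 ** (1 - nu) / gamma(nu)) * scaled**nu * kv(nu, scaled)

print(matern(np.array([0.1, 1.0, 2.0]), nu=2.5, ell=0.8))
```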
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are made through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3×-30×.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Towards Robust and Automatic Hyper-Parameter Tunning [39.04604349338802]
We introduce a new class of HPO methods and explore how the low-rank factorization of the intermediate layers of a convolutional network can be used to define an analytical response surface.
We quantify how this surface behaves as a surrogate for model performance and show that it can be optimized with a trust-region search algorithm, which we call autoHyper.
arXiv Detail & Related papers (2021-11-28T05:27:34Z)
- HyP-ABC: A Novel Automated Hyper-Parameter Tuning Algorithm Using Evolutionary Optimization [1.6114012813668934]
We propose HyP-ABC, an automatic hybrid hyper-parameter optimization algorithm using the modified artificial bee colony approach.
Compared to the state-of-the-art techniques, HyP-ABC is more efficient and has a limited number of parameters to be tuned.
arXiv Detail & Related papers (2021-09-11T16:45:39Z)
- Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional, previously unexplored benefit: it can simultaneously reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z)
- Variance Reduction with Sparse Gradients [82.41780420431205]
Variance reduction methods such as SVRG and SpiderBoost use a mixture of large and small batch gradients.
We introduce a new sparsity operator: the random-top-k operator.
Our algorithm consistently outperforms SpiderBoost on various tasks including image classification, natural language processing, and sparse matrix factorization.
arXiv Detail & Related papers (2020-01-27T08:23:58Z)
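For the sparse-gradients entry above, one plausible reading of a random-top-k operator: keep the k1 largest-magnitude coordinates exactly and take an unbiased, rescaled random sample of k2 of the rest. The paper's exact operator may differ; treat this as an illustrative assumption.
```python
# Hedged sketch: a random-top-k style sparsifier (assumed form, see note above).
import numpy as np

def random_top_k(g, k1, k2, rng):
    """Sparsify gradient g: exact top-k1 entries + rescaled random k2 entries."""
    out = np.zeros_like(g)
    top = np.argsort(np.abs(g))[-k1:]               # largest-magnitude coords
    out[top] = g[top]
    rest = np.setdiff1d(np.arange(g.size), top)
    pick = rng.choice(rest, size=k2, replace=False)
    out[pick] = g[pick] * (rest.size / k2)          # rescale so E[out] matches g
    return out

rng = np.random.default_rng(5)
g = rng.normal(size=100)
sparse_g = random_top_k(g, k1=10, k2=10, rng=rng)
```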
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.