Feature Importance in Gradient Boosting Trees with Cross-Validation Feature Selection
- URL: http://arxiv.org/abs/2109.05468v1
- Date: Sun, 12 Sep 2021 09:32:43 GMT
- Title: Feature Importance in Gradient Boosting Trees with Cross-Validation Feature Selection
- Authors: Afek Ilay Adler and Amichai Painsky
- Abstract summary: We study the effect of biased base learners on Gradient Boosting Machines (GBM) feature importance (FI) measures.
By utilizing cross-validated (CV) unbiased base learners, we fix this flaw at a relatively low computational cost.
We demonstrate the suggested framework in a variety of synthetic and real-world setups, showing a significant improvement in all GBM FI measures while maintaining relatively the same level of prediction accuracy.
- Score: 11.295032417617454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gradient Boosting Machines (GBM) are among the go-to algorithms on tabular
data, which produce state of the art results in many prediction tasks. Despite
its popularity, the GBM framework suffers from a fundamental flaw in its base
learners. Specifically, most implementations utilize decision trees that are
typically biased towards categorical variables with large cardinalities. The
effect of this bias was extensively studied over the years, mostly in terms of
predictive performance. In this work, we extend the scope and study the effect
of biased base learners on GBM feature importance (FI) measures. We show that
although these implementations demonstrate highly competitive predictive
performance, they still, surprisingly, suffer from bias in FI. By utilizing
cross-validated (CV) unbiased base learners, we fix this flaw at a relatively
low computational cost. We demonstrate the suggested framework in a variety of
synthetic and real-world setups, showing a significant improvement in all GBM
FI measures while maintaining relatively the same level of prediction accuracy.
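To make the abstract's claim concrete, here is a minimal, self-contained sketch of the kind of bias it describes: a gradient boosted model assigns non-trivial impurity-based importance to a high-cardinality feature that is pure noise, while importance measured on held-out data largely does not. The sketch uses scikit-learn's GradientBoostingClassifier and permutation importance as a generic stand-in; it is not the authors' CV-based framework, and the data and feature names are illustrative assumptions.

```python
# Minimal illustration (not the authors' CV framework): impurity-based GBM
# feature importance can inflate a high-cardinality noise feature, whereas
# permutation importance computed on held-out data largely does not.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# One informative binary feature and one uninformative feature with 100
# distinct values (a stand-in for a high-cardinality categorical variable).
x_informative = rng.integers(0, 2, size=n)
x_noise_highcard = rng.integers(0, 100, size=n)
y = (x_informative ^ (rng.random(n) < 0.1)).astype(int)  # label depends only on x_informative

X = np.column_stack([x_informative, x_noise_highcard])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gbm = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Built-in (impurity/gain-based) importances, computed on the training data.
print("gain-based FI          :", gbm.feature_importances_)

# Permutation importance on held-out data tends to send the noise feature toward zero.
perm = permutation_importance(gbm, X_test, y_test, n_repeats=20, random_state=0)
print("held-out permutation FI:", perm.importances_mean)
```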
Related papers
- Federated Class-Incremental Learning with Hierarchical Generative Prototypes [10.532838477096055]
Federated Learning (FL) aims at unburdening the training of deep models by distributing computation across multiple devices (clients).
Our proposal constrains both biases in the last layer by efficiently finetuning a pre-trained backbone using learnable prompts.
Our method significantly improves the current State Of The Art, providing an average increase of +7.8% in accuracy.
arXiv Detail & Related papers (2024-06-04T16:12:27Z) - GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models [83.30078426829627]
Large language models (LLMs) have gained popularity and are being widely adopted by a large user community.
The existing evaluation methods have many constraints, and their results exhibit a limited degree of interpretability.
We propose a bias evaluation framework named GPTBIAS that leverages the high performance of LLMs to assess bias in models.
arXiv Detail & Related papers (2023-12-11T12:02:14Z) - Understanding the Detrimental Class-level Effects of Data Augmentation [63.1733767714073]
Data augmentation (DA) that achieves optimal average accuracy can come at the cost of significantly hurting individual class accuracy, by as much as 20% on ImageNet.
We present a framework for understanding how DA interacts with class-level learning dynamics.
We show that simple class-conditional augmentation strategies improve performance on the negatively affected classes.
arXiv Detail & Related papers (2023-12-07T18:37:43Z) - Unbiased Gradient Boosting Decision Tree with Unbiased Feature Importance [6.700461065769045]
The split finding algorithm of Gradient Boosting Decision Tree (GBDT) has been criticized for its bias towards features with a large number of potential splits.
We provide a fine-grained analysis of bias in GBDT and demonstrate that the bias originates, in part, from the systematic bias in the gain estimation of each split.
We propose unbiased gain, a new unbiased measurement of gain importance computed on out-of-bag samples; a generic sketch of OOB-evaluated gain appears after this list.
arXiv Detail & Related papers (2023-05-18T04:17:46Z) - Variational Boosted Soft Trees [13.956254007901675]
Gradient boosting machines (GBMs) based on decision trees consistently demonstrate state-of-the-art results on regression and classification tasks.
We propose to implement Bayesian GBMs using variational inference with soft decision trees.
arXiv Detail & Related papers (2023-02-21T14:51:08Z) - Relieving Long-tailed Instance Segmentation via Pairwise Class Balance [85.53585498649252]
Long-tailed instance segmentation is a challenging task due to the extreme imbalance of training samples among classes.
It causes severe biases of the head classes (with majority samples) against the tailed ones.
We propose a novel Pairwise Class Balance (PCB) method, built upon a confusion matrix which is updated during training to accumulate the ongoing prediction preferences.
arXiv Detail & Related papers (2022-01-08T07:48:36Z) - General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to non-linear settings via deep learning with bias constraints.
A second motivation for BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We evaluate a simple technique, which we call prediction-time batch normalization, that significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
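The "unbiased gain" idea referenced in the Unbiased Gradient Boosting Decision Tree entry above can be illustrated with a short, generic sketch: choose a split threshold greedily on a bootstrap (in-bag) sample, then evaluate its gain on the out-of-bag samples while keeping the fitted leaf means. The data, helper names, and simple squared-error gain below are illustrative assumptions, not that paper's exact algorithm.

```python
# Generic sketch (illustrative only, not the cited paper's exact algorithm):
# measure the gain of a split on out-of-bag (OOB) samples, keeping the split
# threshold and the leaf means that were fitted on the in-bag sample.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)   # candidate split feature, pure noise
y = rng.normal(size=n)   # target, unrelated to x

# Bootstrap resample: drawn indices are in-bag, everything else is out-of-bag.
in_bag = rng.integers(0, n, size=n)
oob = np.setdiff1d(np.arange(n), in_bag)
x_in, y_in, x_out, y_out = x[in_bag], y[in_bag], x[oob], y[oob]

def sse(y_true, pred):
    """Sum of squared errors of a constant prediction."""
    return float(((y_true - pred) ** 2).sum())

def gain(x_eval, y_eval, threshold, left_mean, right_mean, root_mean):
    """SSE reduction of the split on (x_eval, y_eval), using fixed leaf means."""
    left = x_eval <= threshold
    return sse(y_eval, root_mean) - sse(y_eval[left], left_mean) - sse(y_eval[~left], right_mean)

def in_bag_gain(t):
    """Gain a tree would see when fitting and evaluating on the in-bag sample."""
    left = x_in <= t
    return gain(x_in, y_in, t, y_in[left].mean(), y_in[~left].mean(), y_in.mean())

# Greedily choose the threshold that maximizes in-bag gain (what split finding does).
thresholds = np.quantile(x_in, np.linspace(0.05, 0.95, 19))
best = max(thresholds, key=in_bag_gain)

left = x_in <= best
print("in-bag gain:", in_bag_gain(best))  # positive even though x is noise (selection bias)
print("OOB gain   :", gain(x_out, y_out, best, y_in[left].mean(), y_in[~left].mean(), y_in.mean()))
# The OOB-evaluated gain is approximately zero on average for an uninformative feature.
```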