Unbiased Gradient Boosting Decision Tree with Unbiased Feature
Importance
- URL: http://arxiv.org/abs/2305.10696v1
- Date: Thu, 18 May 2023 04:17:46 GMT
- Title: Unbiased Gradient Boosting Decision Tree with Unbiased Feature
Importance
- Authors: Zheyu Zhang, Tianping Zhang, Jian Li
- Abstract summary: Split finding algorithm of Gradient Boosting Decision Tree (GBDT) has been criticized for its bias towards features with a large number of potential splits.
We provide a fine-grained analysis of bias in GBDT and demonstrate that the bias originates from 1) the systematic bias in the gain estimation of each split and 2) the bias in the split finding algorithm, which uses the same data to evaluate and select splits.
We propose unbiased gain, a new unbiased measurement of gain importance using out-of-bag samples.
- Score: 6.700461065769045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gradient Boosting Decision Tree (GBDT) has achieved remarkable success in a
wide variety of applications. The split finding algorithm, which determines the
tree construction process, is one of the most crucial components of GBDT.
However, the split finding algorithm has long been criticized for its bias
towards features with a large number of potential splits. This bias introduces
severe interpretability and overfitting issues in GBDT. To this end, we provide
a fine-grained analysis of bias in GBDT and demonstrate that the bias
originates from 1) the systematic bias in the gain estimation of each split and
2) the bias in the split finding algorithm resulting from the use of the same
data to evaluate the split improvement and determine the best split. Based on
the analysis, we propose unbiased gain, a new unbiased measurement of gain
importance using out-of-bag samples. Moreover, we incorporate the unbiased
property into the split finding algorithm and develop UnbiasedGBM to solve the
overfitting issue of GBDT. We assess the performance of UnbiasedGBM and
unbiased gain in a large-scale empirical study comprising 60 datasets and show
that: 1) UnbiasedGBM exhibits better performance than popular GBDT
implementations such as LightGBM, XGBoost, and CatBoost on average on the 60
datasets and 2) unbiased gain achieves better average performance in feature
selection than popular feature importance methods. The codes are available at
https://github.com/ZheyuAqaZhang/UnbiasedGBM.
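The second source of bias described in the abstract can be illustrated with a toy sketch. This is not the paper's actual algorithm: the variance-reduction gain, the synthetic noise data, and the bag/out-of-bag split below are illustrative assumptions. The point is that searching for the best split and scoring its gain on the same sample inflates the gain, while scoring the chosen split on held-out (out-of-bag) samples does not.

```python
import random

def fitted_gain(x_fit, y_fit, x_eval, y_eval, t):
    """SSE reduction from splitting at threshold t: leaf means are fitted
    on (x_fit, y_fit) and the reduction is measured on (x_eval, y_eval)."""
    left = [y for x, y in zip(x_fit, y_fit) if x <= t]
    right = [y for x, y in zip(x_fit, y_fit) if x > t]
    if not left or not right:
        return 0.0
    root = sum(y_fit) / len(y_fit)
    m_left, m_right = sum(left) / len(left), sum(right) / len(right)
    sse_root = sum((y - root) ** 2 for y in y_eval)
    sse_split = sum((y - (m_left if x <= t else m_right)) ** 2
                    for x, y in zip(x_eval, y_eval))
    return sse_root - sse_split

random.seed(0)
n = 400
x = [random.random() for _ in range(n)]
y = [random.gauss(0.0, 1.0) for _ in range(n)]  # pure noise: no real signal

idx = list(range(n))
random.shuffle(idx)
bag, oob = idx[: n // 2], idx[n // 2:]
xb, yb = [x[i] for i in bag], [y[i] for i in bag]
xo, yo = [x[i] for i in oob], [y[i] for i in oob]

# Search for the best threshold and score it on the SAME (in-bag) data ...
cands = sorted(set(xb))[:-1]
best = max(cands, key=lambda t: fitted_gain(xb, yb, xb, yb, t))
in_bag_gain = fitted_gain(xb, yb, xb, yb, best)
# ... versus scoring the chosen split on held-out (out-of-bag) data.
oob_gain = fitted_gain(xb, yb, xo, yo, best)

print(f"in-bag gain:     {in_bag_gain:.3f}")  # positive even though y is noise
print(f"out-of-bag gain: {oob_gain:.3f}")     # near zero, may be negative
```

Because the in-bag gain is maximized over many candidate thresholds, it is positive even on pure noise; evaluating the same split on samples not used for the search removes that selection effect, which is the intuition behind the paper's out-of-bag unbiased gain.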
Related papers
- Revisiting the Dataset Bias Problem from a Statistical Perspective [72.94990819287551]
We study the "dataset bias" problem from a statistical standpoint.
We identify the main cause of the problem as the strong correlation between a class attribute u and a non-class attribute b.
We propose to mitigate dataset bias by either weighting the objective of each sample n by 1/p(u_n|b_n) or sampling that sample with a weight proportional to 1/p(u_n|b_n).
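A minimal sketch (toy data; the attribute names are hypothetical) of the two options above: reweighting each sample's loss by 1/p(u_n|b_n), or resampling with weight proportional to it. Either choice equalizes the total weight across (u, b) cells, breaking the correlation between the class attribute u and the non-class attribute b.

```python
import random
from collections import Counter

random.seed(0)
# Toy samples (class attribute u, non-class attribute b); u and b are correlated.
samples = [("cat", "indoor")] * 80 + [("cat", "outdoor")] * 20 \
        + [("dog", "indoor")] * 20 + [("dog", "outdoor")] * 80

# Estimate p(u | b) from counts.
joint = Counter(samples)
by_b = Counter(b for _, b in samples)
p_u_given_b = {(u, b): joint[(u, b)] / by_b[b] for (u, b) in joint}

# Option 1: per-sample loss weights 1/p(u_n | b_n).
weights = [1.0 / p_u_given_b[(u, b)] for (u, b) in samples]

# Option 2: resample each example with probability proportional to its weight.
resampled = random.choices(samples, weights=weights, k=len(samples))

# After reweighting, every (u, b) cell carries the same total weight,
# so u is (approximately) independent of b under the weighted distribution.
cell_weight = Counter()
for (u, b), w in zip(samples, weights):
    cell_weight[(u, b)] += w
print(dict(cell_weight))
```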
arXiv Detail & Related papers (2024-02-05T22:58:06Z)
- Causality and Independence Enhancement for Biased Node Classification [56.38828085943763]
We propose a novel Causality and Independence Enhancement (CIE) framework, applicable to various graph neural networks (GNNs)
Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations.
CIE not only significantly enhances the performance of GNNs but also outperforms state-of-the-art debiased node classification methods.
arXiv Detail & Related papers (2023-10-14T13:56:24Z)
- General Debiasing for Multimodal Sentiment Analysis [47.05329012210878]
We propose a general debiasing MSA task, which aims to enhance the Out-Of-Distribution (OOD) generalization ability of MSA models.
We employ inverse probability weighting (IPW) to reduce the effects of large-biased samples, facilitating robust feature learning for sentiment prediction.
The empirical results demonstrate the superior generalization ability of our proposed framework.
arXiv Detail & Related papers (2023-07-20T00:36:41Z)
- SMoA: Sparse Mixture of Adapters to Mitigate Multiple Dataset Biases [27.56143777363971]
We propose a new debiasing method Sparse Mixture-of-Adapters (SMoA), which can mitigate multiple dataset biases effectively and efficiently.
Experiments on Natural Language Inference and Paraphrase Identification tasks demonstrate that SMoA outperforms full-finetuning, adapter tuning baselines, and prior strong debiasing methods.
arXiv Detail & Related papers (2023-02-28T08:47:20Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- Mitigating Representation Bias in Action Recognition: Algorithms and Benchmarks [76.35271072704384]
Deep learning models perform poorly when applied to videos with rare scenes or objects.
We tackle this problem from two different angles: algorithm and dataset.
We show that the debiased representation can generalize better when transferred to other datasets and tasks.
arXiv Detail & Related papers (2022-09-20T00:30:35Z)
- Learning to Split for Automatic Bias Detection [39.353850990332525]
Learning to Split (ls) is an algorithm for automatic bias detection.
We evaluate our approach on Beer Review, CelebA and MNLI.
arXiv Detail & Related papers (2022-04-28T19:41:08Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Feature Importance in Gradient Boosting Trees with Cross-Validation Feature Selection [11.295032417617454]
We study the effect of biased base learners on Gradient Boosting Machines (GBM) feature importance (FI) measures.
By utilizing cross-validated (CV) unbiased base learners, we fix this flaw at a relatively low computational cost.
We demonstrate the suggested framework in a variety of synthetic and real-world setups, showing a significant improvement in all GBM FI measures while maintaining relatively the same level of prediction accuracy.
arXiv Detail & Related papers (2021-09-12T09:32:43Z)
- Greedy Gradient Ensemble for Robust Visual Question Answering [163.65789778416172]
We stress the language bias in Visual Question Answering (VQA) that comes from two aspects, i.e., distribution bias and shortcut bias.
We propose a new de-bias framework, Greedy Gradient Ensemble (GGE), which combines multiple biased models for unbiased base model learning.
GGE forces the biased models to over-fit the biased data distribution first, thus making the base model pay more attention to examples that are hard for the biased models to solve.
arXiv Detail & Related papers (2021-07-27T08:02:49Z)
- AutoDebias: Learning to Debias for Recommendation [43.84313723394282]
We propose AutoDebias, which leverages another (small) set of uniform data to optimize the debiasing parameters.
We derive the generalization bound for AutoDebias and prove its ability to acquire the appropriate debiasing strategy.
arXiv Detail & Related papers (2021-05-10T08:03:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.