Evaluating XGBoost for Balanced and Imbalanced Data: Application to
Fraud Detection
- URL: http://arxiv.org/abs/2303.15218v1
- Date: Mon, 27 Mar 2023 13:59:22 GMT
- Title: Evaluating XGBoost for Balanced and Imbalanced Data: Application to
Fraud Detection
- Authors: Gissel Velarde, Anindya Sudhir, Sanjay Deshmane, Anuj Deshmunkh,
Khushboo Sharma and Vaibhav Joshi
- Abstract summary: This paper evaluates XGBoost's performance given different dataset sizes and class distributions.
XGBoost has been selected for evaluation, as it stands out in several benchmarks due to its detection performance and speed.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper evaluates XGBoost's performance given different dataset sizes and
class distributions, from perfectly balanced to highly imbalanced. XGBoost has
been selected for evaluation, as it stands out in several benchmarks due to its
detection performance and speed. After introducing the problem of fraud
detection, the paper reviews evaluation metrics for detection systems or binary
classifiers, and illustrates with examples how different metrics work for
balanced and imbalanced datasets. Then, it examines the principles of XGBoost.
It proposes a pipeline for data preparation and compares a Vanilla XGBoost
against a random search-tuned XGBoost. Random search fine-tuning provides
consistent improvement for large datasets of 100 thousand samples, but not for
medium and small datasets of 10 thousand and 1 thousand samples, respectively.
In addition, as expected, XGBoost's recognition performance improves as more
data becomes available, and its detection performance deteriorates as the
datasets become more imbalanced. Tests on distributions with 50, 45, 25, and 5 percent positive
samples show that the largest drop in detection performance occurs for the
distribution with only 5 percent positive samples. Sampling to balance the
training set does not provide consistent improvement. Therefore, future work
will include a systematic study of different techniques to deal with data
imbalance and an evaluation of other approaches, including graphs, autoencoders,
and generative adversarial methods, to address the lack of labels.
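The abstract describes the comparison in prose only; below is a minimal sketch of such a comparison, assuming a synthetic imbalanced dataset, scikit-learn's RandomizedSearchCV, and an illustrative hyperparameter space (none of these details are taken from the paper itself):

```python
# Sketch: Vanilla vs. random search-tuned XGBoost on imbalanced data.
# Dataset, search space, and split sizes are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.metrics import average_precision_score
from xgboost import XGBClassifier

# Synthetic dataset with 5 percent positive samples (the hardest case reported).
X, y = make_classification(n_samples=100_000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Vanilla XGBoost: default hyperparameters.
vanilla = XGBClassifier(eval_metric="logloss", random_state=0)
vanilla.fit(X_train, y_train)

# Random search-tuned XGBoost over a small, assumed hyperparameter space.
param_space = {
    "n_estimators": [100, 300, 500],
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "subsample": [0.6, 0.8, 1.0],
}
search = RandomizedSearchCV(
    XGBClassifier(eval_metric="logloss", random_state=0),
    param_space, n_iter=10, scoring="average_precision", cv=3, random_state=0)
search.fit(X_train, y_train)

# Average precision (area under the precision-recall curve) is more informative
# than accuracy on imbalanced data; the no-skill baseline equals the positive ratio.
for name, model in [("vanilla", vanilla), ("tuned", search.best_estimator_)]:
    ap = average_precision_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AP = {ap:.3f} (baseline = {y_test.mean():.3f})")
```

Average precision is reported here because accuracy is nearly uninformative when only 5 percent of the samples are positive.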
Related papers
- Performance of Machine Learning Classifiers for Anomaly Detection in Cyber Security Applications [0.1601392577755919]
This work empirically evaluates machine learning models on two imbalanced public datasets.
Models tested include eXtreme Gradient Boosting (XGB) and Multi-Layer Perceptron (MLP).
IterativeImputer results are comparable to mean and median, but not recommended for large datasets due to increased complexity and execution time.
arXiv Detail & Related papers (2025-04-26T02:43:27Z) - Tree Boosting Methods for Balanced and Imbalanced Classification and their Robustness Over Time in Risk Assessment [0.10925516251778125]
Tree-based methods such as XGBoost, stand out in several benchmarks due to detection performance and speed.
The developed method increases its recognition performance as more data is given for training.
It remains significantly superior to the precision-recall baseline given by the ratio of positives to the total of positives and negatives (see the short baseline sketch after this list).
arXiv Detail & Related papers (2025-04-25T07:35:38Z) - Graph Out-of-Distribution Generalization with Controllable Data
Augmentation [51.17476258673232]
Graph Neural Networks (GNNs) have demonstrated extraordinary performance in classifying graph properties.
Due to the selection bias of training and testing data, distribution deviation is widespread.
We propose OOD calibration to measure the distribution deviation of virtual samples.
arXiv Detail & Related papers (2023-08-16T13:10:27Z) - Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z) - Revisiting Long-tailed Image Classification: Survey and Benchmarks with
New Evaluation Metrics [88.39382177059747]
A corpus of metrics is designed for measuring the accuracy, robustness, and bounds of algorithms for learning with long-tailed distribution.
Based on our benchmarks, we re-evaluate the performance of existing methods on CIFAR10 and CIFAR100 datasets.
arXiv Detail & Related papers (2023-02-03T02:40:54Z) - Labeling-Free Comparison Testing of Deep Learning Models [28.47632100019289]
We propose a labeling-free comparison testing approach to overcome the limitations of labeling effort and sampling randomness.
Our approach outperforms the baseline methods by up to 0.74 and 0.53 on Spearman's correlation and Kendall's $\tau$, regardless of the dataset and distribution shift.
arXiv Detail & Related papers (2022-04-08T10:55:45Z) - Spread Spurious Attribute: Improving Worst-group Accuracy with Spurious
Attribute Estimation [72.92329724600631]
We propose a pseudo-attribute-based algorithm, coined Spread Spurious Attribute, for improving the worst-group accuracy.
Our experiments on various benchmark datasets show that our algorithm consistently outperforms the baseline methods.
We also demonstrate that the proposed SSA can achieve comparable performances to methods using full (100%) spurious attribute supervision.
arXiv Detail & Related papers (2022-04-05T09:08:30Z) - Using calibrator to improve robustness in Machine Reading Comprehension [18.844528744164876]
We propose a method to improve the robustness by using a calibrator as the post-hoc reranker.
Experimental results on adversarial datasets show that our model can achieve performance improvement by more than 10%.
arXiv Detail & Related papers (2022-02-24T02:16:42Z) - Stable Prediction on Graphs with Agnostic Distribution Shift [105.12836224149633]
Graph neural networks (GNNs) have been shown to be effective on various graph tasks with randomly separated training and testing data.
In real applications, however, the distribution of the training graph might differ from that of the test graph.
We propose a novel stable prediction framework for GNNs, which permits both locally and globally stable learning and prediction on graphs.
arXiv Detail & Related papers (2021-10-08T02:45:47Z) - Assessing the Quality of the Datasets by Identifying Mislabeled Samples [14.881597737762316]
We propose a novel statistic -- noise score -- as a measure for the quality of each data point to identify mislabeled samples.
In our work, we use the representations derived by the inference network of the data quality supervised variational autoencoder (AQUAVS).
We validate our proposed statistic through experimentation by corrupting MNIST, FashionMNIST, and CIFAR10/100 datasets.
arXiv Detail & Related papers (2021-09-10T17:14:09Z) - Rethinking Sampling Strategies for Unsupervised Person Re-identification [59.47536050785886]
We analyze the reasons for the performance differences between various sampling strategies under the same framework and loss function.
Group sampling is proposed, which gathers samples from the same class into groups.
Experiments on Market-1501, DukeMTMC-reID and MSMT17 show that group sampling achieves performance comparable to state-of-the-art methods.
arXiv Detail & Related papers (2021-07-07T05:39:58Z) - Identifying Statistical Bias in Dataset Replication [102.92137353938388]
We study a replication of the ImageNet dataset on which models exhibit a significant (11-14%) drop in accuracy.
After correcting for the identified statistical bias, only an estimated $3.6\% \pm 1.5\%$ of the original $11.7\% \pm 1.0\%$ accuracy drop remains unaccounted for.
arXiv Detail & Related papers (2020-05-19T17:48:32Z)
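As noted for the tree-boosting entry above, the precision-recall baseline is simply the positive-class prevalence; a minimal illustration with assumed counts (not taken from any paper in this list):

```python
# No-skill baseline for the precision-recall curve: a random or constant classifier's
# average precision equals positives / (positives + negatives).
positives, negatives = 5_000, 95_000           # illustrative: 5 percent positive samples
baseline = positives / (positives + negatives)
print(f"precision-recall baseline: {baseline:.2f}")  # 0.05
```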