Related papers: Improving Credit Card Fraud Detection with an Optimized Explainable Boosting Machine

URL: http://arxiv.org/abs/2602.06955v1
Date: Fri, 06 Feb 2026 18:56:17 GMT
Title: Improving Credit Card Fraud Detection with an Optimized Explainable Boosting Machine
Authors: Reza E. Fazel, Arash Bakhtiary, Siavash A. Bigdeli
Abstract summary: The study proposes an enhanced workflow based on the Explainable Boosting Machine (EBM). The optimized EBM achieves an effective balance between accuracy and interpretability, enabling precise detection of fraudulent transactions. Experimental evaluation on benchmark credit card data yields an ROC-AUC of 0.983, surpassing prior EBM baselines.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Addressing class imbalance is a central challenge in credit card fraud detection, as it directly impacts predictive reliability in real-world financial systems. To overcome this, the study proposes an enhanced workflow based on the Explainable Boosting Machine (EBM), a transparent, state-of-the-art implementation of the GA2M algorithm, and optimizes it through systematic hyperparameter tuning, feature selection, and preprocessing refinement. Rather than relying on conventional sampling techniques that may introduce bias or cause information loss, the optimized EBM achieves an effective balance between accuracy and interpretability, enabling precise detection of fraudulent transactions while providing actionable insights into feature importance and interaction effects. Furthermore, the Taguchi method is employed to optimize both the sequence of data scalers and model hyperparameters, ensuring robust, reproducible, and systematically validated performance improvements. Experimental evaluation on benchmark credit card data yields an ROC-AUC of 0.983, surpassing prior EBM baselines (0.975) and outperforming Logistic Regression, Random Forest, XGBoost, and Decision Tree models. These results highlight the potential of interpretable machine learning and data-driven optimization for advancing trustworthy fraud analytics in financial systems.
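
The abstract mentions the Taguchi method for choosing scalers and hyperparameters but does not state which factors, levels, or orthogonal array were used. The following is only a minimal sketch of the general idea, assuming a standard L9 array, four hypothetical three-level factors, and a made-up additive `toy_score` standing in for a cross-validated ROC-AUC. It uses plain mean responses rather than Taguchi signal-to-noise ratios, and none of the names or values come from the paper.

```python
from itertools import combinations, product
from statistics import mean

# Standard L9 orthogonal array: 9 runs, 4 factors, 3 levels each (0-indexed).
L9 = [
    (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
    (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
    (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
]

# Hypothetical factors and levels, chosen for illustration only.
FACTORS = {
    "scaler": ["standard", "minmax", "robust"],
    "learning_rate": [0.005, 0.01, 0.05],
    "max_bins": [128, 256, 512],
    "interactions": [5, 10, 20],
}


def is_orthogonal(array, n_levels=3):
    """Check that every pair of columns contains each level pair exactly once."""
    expected = sorted(product(range(n_levels), repeat=2))
    n_cols = len(array[0])
    return all(
        sorted((row[a], row[b]) for row in array) == expected
        for a, b in combinations(range(n_cols), 2)
    )


def toy_score(config):
    """Made-up additive stand-in for a cross-validated ROC-AUC."""
    bonus = {
        "scaler": {"standard": 0.000, "minmax": 0.001, "robust": 0.003},
        "learning_rate": {0.005: 0.001, 0.01: 0.004, 0.05: 0.000},
        "max_bins": {128: 0.000, 256: 0.003, 512: 0.002},
        "interactions": {5: 0.000, 10: 0.002, 20: 0.001},
    }
    return 0.970 + sum(bonus[name][value] for name, value in config.items())


def run_taguchi(score_fn):
    """Evaluate the 9 designed runs, then pick the best level per factor
    from the mean response at each level (main-effects analysis)."""
    names = list(FACTORS)
    results = []
    for row in L9:
        config = {name: FACTORS[name][lvl] for name, lvl in zip(names, row)}
        results.append((row, score_fn(config)))

    best = {}
    for col, name in enumerate(names):
        level_means = [
            mean(score for row, score in results if row[col] == lvl)
            for lvl in range(3)
        ]
        best[name] = FACTORS[name][level_means.index(max(level_means))]
    return best


if __name__ == "__main__":
    print(is_orthogonal(L9))
    print(run_taguchi(toy_score))
```

The design needs 9 evaluations instead of the 81 of a full 3^4 grid. Because the toy score is additive, the main-effects analysis recovers the best level of every factor even though that exact combination is not one of the 9 tested runs; with a real model, factor interactions can break this, so the predicted configuration would need a confirmation run.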

Related papers

PRIME: A Process-Outcome Alignment Benchmark for Verifiable Reasoning in Mathematics and Engineering [71.15346406323827]
We introduce PRIME, a benchmark for evaluating verifiers on Process-Outcome Alignment verification. We find that current verifiers frequently fail to detect derivation flaws. We propose a process-aware RLVR training paradigm utilizing verifiers selected via PRIME.
arXiv Detail & Related papers (2026-02-12T04:45:01Z)
Calibrating Agent-Based Financial Markets Simulators with Pretrainable Automatic Posterior Transformation-Based Surrogates [5.002657036975061]
Calibrating Agent-Based Models (ABMs) is an important optimization problem for simulating complex social systems. The goal is to identify the optimal parameter of a given ABM by minimizing the discrepancy between the simulated data and the real-world observations. Existing methods face two key limitations: 1) surrogating the original evaluation function is hard due to the nonlinear yet multi-modal nature of the ABMs, and 2) the commonly used surrogates cannot share the optimization experience among multiple calibration tasks. This work proposes Automatic posterior transformation with Negatively Correlated Search and Adaptive Trust-Region.
arXiv Detail & Related papers (2026-01-11T14:05:26Z)
Principled Algorithms for Optimizing Generalized Metrics in Binary Classification [53.604375124674796]
We introduce principled algorithms for optimizing generalized metrics, supported by $H$-consistency and finite-sample generalization bounds. Our approach reformulates metric optimization as a generalized cost-sensitive learning problem. We develop new algorithms, METRO, with strong theoretical performance guarantees.
arXiv Detail & Related papers (2025-12-29T01:33:42Z)
Pushing the Boundaries of Interpretability: Incremental Enhancements to the Explainable Boosting Machine [1.2461503242570642]
This paper aims at improving the Explainable Boosting Machine (EBM), a state-of-the-art glassbox model that delivers both high accuracy and complete transparency. The work is positioned as a critical step toward developing machine learning systems that are robust, equitable, and transparent.
arXiv Detail & Related papers (2025-11-29T15:46:13Z)
Distributionally Robust Optimization with Adversarial Data Contamination [49.89480853499918]
We focus on optimizing Wasserstein-1 DRO objectives for generalized linear models with convex Lipschitz loss functions. Our primary contribution lies in a novel modeling framework that integrates robustness against training data contamination with robustness against distributional shifts. This work establishes the first rigorous guarantees, supported by efficient computation, for learning under the dual challenges of data contamination and distributional shifts.
arXiv Detail & Related papers (2025-07-14T18:34:10Z)
A Data Balancing and Ensemble Learning Approach for Credit Card Fraud Detection [1.8921747725821432]
This research introduces an innovative method for identifying credit card fraud by combining the SMOTE-KMEANS technique with an ensemble machine learning model. The proposed model was benchmarked against traditional models such as logistic regression, decision trees, random forests, and support vector machines. Results demonstrated that the proposed model achieved superior performance, with an AUC of 0.96 when combined with the SMOTE-KMEANS algorithm.
arXiv Detail & Related papers (2025-03-27T04:59:45Z)
Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling [41.19330514054401]
Large language models (LLMs) are prone to hallucination stemming from misaligned self-awareness. We propose the Explicit Knowledge Boundary Modeling framework to integrate fast and slow reasoning systems to harmonize reliability and usability.
arXiv Detail & Related papers (2025-03-04T03:16:02Z)
Integrating Fuzzy Logic into Deep Symbolic Regression [3.0846824529023382]
Credit card fraud detection is a critical concern for financial institutions, intensified by the rise of contactless payment technologies. This paper explores the integration of fuzzy logic into Deep Symbolic Regression to enhance both performance and explainability in fraud detection.
arXiv Detail & Related papers (2024-11-01T07:55:17Z)
Explainable AI for Fraud Detection: An Attention-Based Ensemble of CNNs, GNNs, and A Confidence-Driven Gating Mechanism [5.486205584465161]
This study presents a new stacking-based approach for CCF detection by adding two extra layers to the usual classification process. In the attention layer, we combine soft outputs from a convolutional neural network (CNN) and a recurrent neural network (RNN) using the dependent ordered weighted averaging (DOWA) operator. In the confidence-based layer, we select whichever aggregate (DOWA or IOWA) shows lower uncertainty to feed into a meta-learner. Experiments on three datasets show that our method achieves high accuracy and robust generalization, making it effective for CCF detection.
arXiv Detail & Related papers (2024-10-01T09:56:23Z)
Revisiting BPR: A Replicability Study of a Common Recommender System Baseline [78.00363373925758]
We study the features of the BPR model, indicating their impact on its performance, and investigate open-source BPR implementations. Our analysis reveals inconsistencies between these implementations and the original BPR paper, leading to a significant decrease in performance of up to 50% for specific implementations. We show that the BPR model can achieve performance levels close to state-of-the-art methods on the top-n recommendation tasks and even outperform them on specific datasets.
arXiv Detail & Related papers (2024-09-21T18:39:53Z)
On the Potential of Network-Based Features for Fraud Detection [3.0846824529023382]
This article explores using the personalised PageRank (PPR) algorithm to capture the social dynamics of fraud. The primary objective is to compare the performance of traditional features with the addition of PPR in fraud detection models. Results indicate that integrating PPR enhances the model's predictive power, surpassing the baseline model.
arXiv Detail & Related papers (2024-02-14T13:20:09Z)
Free Lunch for Generating Effective Outlier Supervision [46.37464572099351]
We propose an ultra-effective method to generate near-realistic outlier supervision. Our proposed BayesAug significantly reduces the false positive rate by over 12.50% compared with the previous schemes.
arXiv Detail & Related papers (2023-01-17T01:46:45Z)
Model-based Causal Bayesian Optimization [78.120734120667]
We propose model-based causal Bayesian optimization (MCBO). MCBO learns a full system model instead of only modeling intervention-reward pairs. Unlike in standard Bayesian optimization, our acquisition function cannot be evaluated in closed form.
arXiv Detail & Related papers (2022-11-18T14:28:21Z)
Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design. We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.