Credit card score prediction using machine learning models: A new
dataset
- URL: http://arxiv.org/abs/2310.02956v2
- Date: Sun, 15 Oct 2023 06:27:58 GMT
- Title: Credit card score prediction using machine learning models: A new
dataset
- Authors: Anas Arram, Masri Ayob, Musatafa Abbas Abbood Albadr, Alaa Sulaiman,
Dheeb Albashish
- Abstract summary: This study investigates the use of machine learning (ML) models for credit card default prediction.
The main goal is to identify the best-performing ML model for a newly proposed credit card scoring dataset.
- Score: 2.099922236065961
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of credit cards has recently increased, creating an essential need
for credit card assessment methods to minimize potential risks. This study
investigates the use of machine learning (ML) models for credit card default
prediction. The main goal is to identify the best-performing ML model for a
newly proposed credit card scoring dataset. The dataset, which includes credit
card transaction histories and customer profiles, is tested using a variety of
machine learning algorithms, including logistic regression, decision trees,
random forests, multi-layer perceptron (MLP) neural network, XGBoost, and
LightGBM. To prepare the data for the machine learning models, we perform data
pre-processing, feature extraction, feature selection, and data balancing.
Experimental results demonstrate that MLP outperforms logistic regression,
decision trees, random forests, LightGBM, and XGBoost in true positive rate,
achieving an area under the curve (AUC) of 86.7% and an accuracy of 91.6%,
with a recall exceeding 80%. These results indicate the superiority of MLP in
predicting defaulting customers and assessing potential risks. Furthermore,
they can help banks and other financial institutions predict loan defaults at
an earlier stage.
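The abstract reports three metrics (accuracy, recall or true positive rate, and AUC). The following is a minimal sketch of how these are commonly computed for a binary default label; it is not the authors' code. The function names and the ten-customer toy data are hypothetical and chosen only for illustration, with label 1 assumed to mean "defaulted".

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all customers whose label was predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positive rate: share of actual defaulters (label 1) that were flagged."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen defaulter receives a
    higher score than a randomly chosen non-defaulter (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 8 non-defaulters (0) and 2 defaulters (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# Hypothetical model scores (estimated probability of default).
scores = [0.10, 0.20, 0.15, 0.30, 0.05, 0.40, 0.60, 0.25, 0.70, 0.35]
# Turn scores into hard predictions with an assumed threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

# A trivial baseline that never predicts default.
always_zero = [0] * len(y_true)

print("model accuracy:", accuracy(y_true, y_pred))
print("model recall:", recall(y_true, y_pred))
print("model AUC:", roc_auc(y_true, scores))
print("baseline accuracy:", accuracy(y_true, always_zero))
print("baseline recall:", recall(y_true, always_zero))
```

On imbalanced data such as this toy set, the never-default baseline matches the model's accuracy while catching no defaulters. This is one reason recall and AUC are usually reported alongside accuracy for default prediction.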
Related papers
- Bank Loan Prediction Using Machine Learning Techniques [0.0]
We have worked on a dataset of 148,670 instances and 37 attributes using machine learning methods.
The best-performing algorithm was AdaBoosting, which achieved an incredible accuracy of 99.99%.
arXiv Detail & Related papers (2024-10-11T15:01:47Z)
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
- Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process.
We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals.
The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
- Improving Fairness in Credit Lending Models using Subgroup Threshold Optimization [0.0]
We introduce a new fairness technique called Subgroup Threshold (STO).
STO works by optimizing the classification thresholds for individual subgroups in order to minimize the overall discrimination score between them.
Our experiments on a real-world credit lending dataset show that STO can reduce gender discrimination by over 90%.
arXiv Detail & Related papers (2024-03-15T19:36:56Z)
- Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
arXiv Detail & Related papers (2023-05-01T20:04:46Z)
- Feature Selection with Annealing for Forecasting Financial Time Series [2.44755919161855]
This study provides a comprehensive method for forecasting financial time series based on tactical input output feature mapping techniques using machine learning (ML) models.
Experiments indicate that the FSA algorithm increased the performance of ML models, regardless of problem type.
arXiv Detail & Related papers (2023-03-03T21:33:38Z)
- Machine Learning Models Evaluation and Feature Importance Analysis on NPL Dataset [0.0]
We evaluate how different Machine learning models perform on the dataset provided by a private bank in Ethiopia.
XGBoost achieves the highest F1 score on the KMeans SMOTE over-sampled data.
arXiv Detail & Related papers (2022-08-28T17:09:44Z)
- Predicting Credit Risk for Unsecured Lending: A Machine Learning Approach [0.0]
This research paper builds a contemporary credit scoring model to forecast credit defaults for unsecured lending (credit cards).
Our research indicates that the Light Gradient Boosting Machine (LGBM) model is better equipped to deliver higher learning speeds, better efficiencies and manage larger data volumes.
We expect that deployment of this model will enable better and timely prediction of credit defaults for decision-makers in commercial lending institutions and banks.
arXiv Detail & Related papers (2021-10-05T17:54:56Z)
- Explanations of Machine Learning predictions: a mandatory step for its application to Operational Processes [61.20223338508952]
Credit Risk Modelling plays a paramount role.
Recent machine and deep learning techniques have been applied to the task.
We suggest using the LIME technique to tackle the explainability problem in this field.
arXiv Detail & Related papers (2020-12-30T10:27:59Z)
- Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.