Advanced User Credit Risk Prediction Model using LightGBM, XGBoost and Tabnet with SMOTEENN
- URL: http://arxiv.org/abs/2408.03497v1
- Date: Wed, 7 Aug 2024 01:37:10 GMT
- Title: Advanced User Credit Risk Prediction Model using LightGBM, XGBoost and Tabnet with SMOTEENN
- Authors: Chang Yu, Yixin Jin, Qianwen Xing, Ye Zhang, Shaobo Guo, Shuchen Meng
- Abstract summary: We use a dataset of over 40,000 records provided by a commercial bank as the research object.
Experiments demonstrated that LightGBM combined with PCA and SMOTEENN techniques can assist banks in accurately predicting potential high-quality customers.
- Score: 8.225603728650478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bank credit risk is a significant challenge in modern financial transactions, and the ability to identify qualified credit card holders among a large number of applicants is crucial for the profitability of a bank's credit card business. In the past, screening applicants' conditions required a significant amount of manual labor, which was time-consuming. Although the accuracy and reliability of previously used ML models have been continuously improving, major banks in the financial industry continue to pursue more reliable and powerful AI models. In this study, we used a dataset of over 40,000 records provided by a commercial bank as the research object. We compared various dimensionality reduction techniques such as PCA and t-SNE for preprocessing high-dimensional datasets and performed in-depth adaptation and tuning of distributed models such as LightGBM and XGBoost, as well as deep models like Tabnet. By combining SMOTEENN with these techniques, we obtained excellent results. The experiments demonstrated that LightGBM combined with PCA and SMOTEENN techniques can assist banks in accurately predicting potential high-quality customers, showing relatively outstanding performance compared to other models.
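The resampling step named in the abstract, SMOTEENN, chains SMOTE oversampling with Edited Nearest Neighbours (ENN) cleaning. The following is a minimal pure-Python sketch of that idea on toy two-dimensional data, not the authors' implementation: the function names, the toy points, the binary-class assumption, and the majority-vote ENN rule are all illustrative assumptions. It leaves out the PCA and LightGBM stages of the pipeline.

```python
import math
import random
from collections import Counter


def smote(X, y, minority_label, k=3, rng=None):
    """Add synthetic minority points until the two classes are balanced.

    Assumes a binary problem. Each synthetic point lies on the segment
    between a random minority point and one of its k nearest minority
    neighbours.
    """
    rng = rng or random.Random(0)
    minority_pts = [x for x, label in zip(X, y) if label == minority_label]
    n_needed = (len(X) - len(minority_pts)) - len(minority_pts)
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority_pts)
        others = [p for p in minority_pts if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        X_out.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
        y_out.append(minority_label)
    return X_out, y_out


def edited_nearest_neighbours(X, y, k=3):
    """Keep a sample only if the majority of its k nearest neighbours share its label."""
    X_out, y_out = [], []
    for i, (x, label) in enumerate(zip(X, y)):
        others = [j for j in range(len(X)) if j != i]
        nearest = sorted(others, key=lambda j: math.dist(x, X[j]))[:k]
        vote = Counter(y[j] for j in nearest).most_common(1)[0][0]
        if vote == label:
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: 20 majority points (label 0), 5 minority points (label 1).
rng = random.Random(0)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
minority = [(rng.gauss(4, 0.5), rng.gauss(4, 0.5)) for _ in range(5)]
X = majority + minority
y = [0] * 20 + [1] * 5

X_res, y_res = smote(X, y, minority_label=1, k=3, rng=rng)    # oversample
X_clean, y_clean = edited_nearest_neighbours(X_res, y_res)    # then clean
print(Counter(y), Counter(y_res), Counter(y_clean))
```

In practice a library implementation such as `SMOTEENN` from imbalanced-learn would normally be used instead, and resampling is applied to the training split only so that the evaluation data keeps its original class distribution.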
Related papers
- Enhanced Credit Score Prediction Using Ensemble Deep Learning Model [12.85570952381681]
This paper combines high-performance models like XGBoost and LightGBM, already widely used in modern banking systems, with the powerful TabNet model.
We have developed a potent model capable of accurately determining credit score levels by integrating Random Forest, XGBoost, and TabNet, and through the stacking technique in ensemble modeling.
arXiv Detail & Related papers (2024-09-30T21:56:16Z)
- Advanced Payment Security System: XGBoost, LightGBM and SMOTE Integrated [16.906931748453342]
This study explores the application of advanced machine learning models, specifically based on XGBoost and LightGBM.
By selecting highly correlated features, we aimed to strengthen the training process and boost model performance.
Our detailed analyses and comparisons reveal that the combination of SMOTE with XGBoost and LightGBM offers a highly efficient and powerful mechanism for payment security protection.
arXiv Detail & Related papers (2024-06-07T05:56:43Z)
- Credit Card Fraud Detection Using Advanced Transformer Model [15.34892016767672]
This study focuses on innovative applications of the latest Transformer models for more robust and precise fraud detection.
We meticulously processed the data sources, balancing the dataset to address the issue of data sparsity significantly.
We conducted performance comparisons with several widely adopted models, including Support Vector Machine (SVM), Random Forest, Neural Network, and Logistic Regression.
arXiv Detail & Related papers (2024-06-06T04:12:57Z)
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Model (LLM) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z)
- Research on Credit Risk Early Warning Model of Commercial Banks Based on Neural Network Algorithm [12.315852697312195]
This study harnesses advanced neural network techniques, notably the Backpropagation (BP) neural network, to pioneer a novel model for preempting credit risk in commercial banks.
Research findings evinced that this model efficaciously enhances the foresight and precision of credit risk management.
arXiv Detail & Related papers (2024-05-17T13:18:46Z)
- Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models [53.620827459684094]
Large Language Models (LLMs) have great potential for credit scoring tasks, with strong generalization ability across multiple tasks.
We propose the first open-source comprehensive framework for exploring LLMs for credit scoring.
We then propose the first Credit and Risk Assessment Large Language Model (CALM) by instruction tuning, tailored to the nuanced demands of various financial risk assessment tasks.
arXiv Detail & Related papers (2023-10-01T03:50:34Z)
- Bayesian Bilinear Neural Network for Predicting the Mid-price Dynamics in Limit-Order Book Markets [84.90242084523565]
Traditional time-series econometric methods often appear incapable of capturing the true complexity of the multi-level interactions driving the price dynamics.
By adopting a state-of-the-art second-order optimization algorithm, we train a Bayesian bilinear neural network with temporal attention.
By addressing the use of predictive distributions to analyze errors and uncertainties associated with the estimated parameters and model forecasts, we thoroughly compare our Bayesian model with traditional ML alternatives.
arXiv Detail & Related papers (2022-03-07T18:59:54Z)
- Predicting Credit Risk for Unsecured Lending: A Machine Learning Approach [0.0]
This paper builds a contemporary credit scoring model to forecast credit defaults for unsecured lending (credit cards).
Our research indicates that the Light Gradient Boosting Machine (LGBM) model is better equipped to deliver higher learning speeds, better efficiencies and manage larger data volumes.
We expect that deployment of this model will enable better and timely prediction of credit defaults for decision-makers in commercial lending institutions and banks.
arXiv Detail & Related papers (2021-10-05T17:54:56Z)
- Explanations of Machine Learning predictions: a mandatory step for its application to Operational Processes [61.20223338508952]
Credit Risk Modelling plays a paramount role.
Recent machine and deep learning techniques have been applied to the task.
We suggest using the LIME technique to tackle the explainability problem in this field.
arXiv Detail & Related papers (2020-12-30T10:27:59Z)
- Families In Wild Multimedia: A Multimodal Database for Recognizing Kinship [63.27052967981546]
We introduce the first publicly available multi-task MM kinship dataset.
To build FIW MM, we developed machinery to automatically collect, annotate, and prepare the data.
Results highlight edge cases to inspire future research with different areas of improvement.
arXiv Detail & Related papers (2020-07-28T22:36:57Z)
- Super-App Behavioral Patterns in Credit Risk Models: Financial, Statistical and Regulatory Implications [110.54266632357673]
We present the impact of alternative data that originates from an app-based marketplace, in contrast to traditional bureau data, upon credit scoring models.
Our results, validated across two countries, show that these new sources of data are particularly useful for predicting financial behavior in low-wealth and young individuals.
arXiv Detail & Related papers (2020-05-09T01:32:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.