Related papers: Predicting Bank Loan Default with Extreme Gradient Boosting

Predicting Bank Loan Default with Extreme Gradient Boosting

URL: http://arxiv.org/abs/2002.02011v1
Date: Sat, 18 Jan 2020 18:52:10 GMT
Title: Predicting Bank Loan Default with Extreme Gradient Boosting
Authors: Rising Odegua
Abstract summary: We use an Extreme Gradient Boosting algorithm called XGBoost for loan default prediction. The prediction is based on loan data from a leading bank, taking into consideration data sets from both the loan application and the demographics of the applicant.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Loan default prediction is one of the most important and critical problems faced by banks and other financial institutions as it has a huge effect on profit. Although many traditional methods exist for mining information about a loan application, most of these methods seem to be under-performing as there have been reported increases in the number of bad loans. In this paper, we use an Extreme Gradient Boosting algorithm called XGBoost for loan default prediction. The prediction is based on loan data from a leading bank, taking into consideration data sets from both the loan application and the demographics of the applicant. We also present important evaluation metrics such as Accuracy, Recall, Precision, F1-Score and ROC area of the analysis. This paper provides an effective basis for loan credit approval in order to identify risky customers from a large number of loan applications using predictive modeling.
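The evaluation metrics named in the abstract (Accuracy, Recall, Precision, F1-Score and ROC area) can be illustrated with a minimal sketch in plain Python. This is not the paper's code: the function names `classification_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, with label 1 standing for "default".

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = default)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defaults caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # good loans passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # good loans flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defaults missed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC area as the probability that a randomly chosen defaulter
    receives a higher score than a randomly chosen non-defaulter
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC area needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from the paper).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8, 0.3, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

On imbalanced loan data, accuracy alone can look high while most defaults are missed, which is one reason to report recall, precision and ROC area alongside it.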

Related papers

BankMathBench: A Benchmark for Numerical Reasoning in Banking Scenarios [45.48548225665319]
Large language models (LLMs)-based chatbots are increasingly being adopted in the financial domain. These models still exhibit low accuracy in core banking computations. BankMathBench is a domain-specific dataset that reflects realistic banking tasks.
arXiv Detail & Related papers (2026-02-19T04:27:47Z)
FinAgentBench: A Benchmark Dataset for Agentic Retrieval in Financial Question Answering [57.18367828883773]
FinAgentBench is a benchmark for evaluating agentic retrieval with multi-step reasoning in finance. The benchmark consists of 26K expert-annotated examples on S&P-500 listed firms. We evaluate a suite of state-of-the-art models and demonstrate how targeted fine-tuning can significantly improve agentic retrieval performance.
arXiv Detail & Related papers (2025-08-07T22:15:22Z)
Escaping the Subprime Trap in Algorithmic Lending [49.1574468325115]
We study the role of risk-management constraints, specifically Value-at-Risk (VaR) constraints, in the persistence of segregation in loan approval decisions. We develop a formal model in which a mainstream (low-interest) bank is more sensitive to variance risk than a subprime bank. We show that a small, finite subsidy can help minority groups escape the trap by covering enough of the mainstream bank's downside.
arXiv Detail & Related papers (2025-02-25T03:43:57Z)
Bank Loan Prediction Using Machine Learning Techniques [0.0]
We have worked on a dataset of 148,670 instances and 37 attributes using machine learning methods. The best-performing algorithm was AdaBoosting, which achieved an incredible accuracy of 99.99%.
arXiv Detail & Related papers (2024-10-11T15:01:47Z)
Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models [53.620827459684094]
Large Language Models (LLMs) have great potential for credit scoring tasks, with strong generalization ability across multiple tasks. We propose the first open-source comprehensive framework for exploring LLMs for credit scoring. We then propose the first Credit and Risk Assessment Large Language Model (CALM) by instruction tuning, tailored to the nuanced demands of various financial risk assessment tasks.
arXiv Detail & Related papers (2023-10-01T03:50:34Z)
Inclusive FinTech Lending via Contrastive Learning and Domain Adaptation [9.75150920742607]
FinTech lending has played a significant role in facilitating financial inclusion. There are concerns about the potentially biased algorithmic decision-making during loan screening. We propose a new Transformer-based sequential loan screening model with self-supervised contrastive learning and domain adaptation.
arXiv Detail & Related papers (2023-05-10T01:11:35Z)
Machine Learning Models Evaluation and Feature Importance Analysis on NPL Dataset [0.0]
We evaluate how different Machine learning models perform on the dataset provided by a private bank in Ethiopia. XGBoost achieves the highest F1 score on the KMeans SMOTE over-sampled data.
arXiv Detail & Related papers (2022-08-28T17:09:44Z)
Neural Pseudo-Label Optimism for the Bank Loan Problem [78.66533961716728]
We study a class of classification problems best exemplified by the bank loan problem. In the case of linear models, this issue can be addressed by adding optimism directly into the model predictions. We present Pseudo-Label Optimism (PLOT), a conceptually and computationally simple method for this setting applicable to Deep Neural Networks.
arXiv Detail & Related papers (2021-12-03T22:46:31Z)
Feature-Level Fusion of Super-App and Telecommunication Alternative Data Sources for Credit Card Fraud Detection [106.33204064461802]
We review the effectiveness of a feature-level fusion of super-app customer information, mobile phone line data, and traditional credit risk variables for the early detection of identity theft credit card fraud. We evaluate our approach over approximately 90,000 users from a credit lender's digital platform database.
arXiv Detail & Related papers (2021-11-05T19:10:35Z)
Predicting Credit Risk for Unsecured Lending: A Machine Learning Approach [0.0]
This research paper aims to build a contemporary credit scoring model to forecast credit defaults for unsecured lending (credit cards). Our research indicates that the Light Gradient Boosting Machine (LGBM) model is better equipped to deliver higher learning speeds and better efficiency, and to manage larger data volumes. We expect that deployment of this model will enable better and timely prediction of credit defaults for decision-makers in commercial lending institutions and banks.
arXiv Detail & Related papers (2021-10-05T17:54:56Z)
How Costly is Noise? Data and Disparities in Consumer Credit [0.0]
We show that credit scores are noisier indicators of default risk for historically under-served groups. We find that equalizing the precision of credit scores can reduce disparities in approval rates and in credit misallocation for disadvantaged groups by approximately half.
arXiv Detail & Related papers (2021-05-17T00:42:26Z)
Enhancing User's Income Estimation with Super-App Alternative Data [59.60094442546867]
It compares the performance of these alternative data sources with the performance of industry-accepted bureau income estimators. Ultimately, this paper shows the incentive for financial institutions to seek to incorporate alternative data into constructing their risk profiles.
arXiv Detail & Related papers (2021-04-12T21:34:44Z)
Explanations of Machine Learning predictions: a mandatory step for its application to Operational Processes [61.20223338508952]
Credit Risk Modelling plays a paramount role. Recent machine and deep learning techniques have been applied to the task. We suggest using the LIME technique to tackle the explainability problem in this field.
arXiv Detail & Related papers (2020-12-30T10:27:59Z)
Super-App Behavioral Patterns in Credit Risk Models: Financial, Statistical and Regulatory Implications [110.54266632357673]
We present the impact of alternative data that originates from an app-based marketplace, in contrast to traditional bureau data, upon credit scoring models. Our results, validated across two countries, show that these new sources of data are particularly useful for predicting financial behavior in low-wealth and young individuals.
arXiv Detail & Related papers (2020-05-09T01:32:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.