Enhancing Credit Risk Prediction: A Meta-Learning Framework Integrating Baseline Models, LASSO, and ECOC for Superior Accuracy
- URL: http://arxiv.org/abs/2509.22381v1
- Date: Fri, 26 Sep 2025 14:09:04 GMT
- Title: Enhancing Credit Risk Prediction: A Meta-Learning Framework Integrating Baseline Models, LASSO, and ECOC for Superior Accuracy
- Authors: Haibo Wang, Lutfu S. Sua, Jun Huang, Figen Balo, Burak Dolar,
- Abstract summary: This research proposes a comprehensive meta-learning framework that synthesizes multiple complementary models.<n>We implement Permutation Feature Importance analysis for each prediction class across all constituent models.<n>Results demonstrate that our framework significantly enhances the accuracy of financial entity classification.
- Score: 7.254744067646655
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective credit risk management is fundamental to financial decision-making, necessitating robust models for default probability prediction and financial entity classification. Traditional machine learning approaches face significant challenges when confronted with high-dimensional data, limited interpretability, rare event detection, and multi-class imbalance problems in risk assessment. This research proposes a comprehensive meta-learning framework that synthesizes multiple complementary models: supervised learning algorithms, including XGBoost, Random Forest, Support Vector Machine, and Decision Tree; unsupervised methods such as K-Nearest Neighbors; deep learning architectures like Multilayer Perceptron; alongside LASSO regularization for feature selection and dimensionality reduction; and Error-Correcting Output Codes as a meta-classifier for handling imbalanced multi-class problems. We implement Permutation Feature Importance analysis for each prediction class across all constituent models to enhance model transparency. Our framework aims to optimize predictive performance while providing a more holistic approach to credit risk assessment. This research contributes to the development of more accurate and reliable computational models for strategic financial decision support by addressing three fundamental challenges in credit risk modeling. The empirical validation of our approach involves an analysis of the Corporate Credit Ratings dataset with credit ratings for 2,029 publicly listed US companies. Results demonstrate that our meta-learning framework significantly enhances the accuracy of financial entity classification regarding credit rating migrations (upgrades and downgrades) and default probability estimation.
Related papers
- Risk-Aware Financial Forecasting Enhanced by Machine Learning and Intuitionistic Fuzzy Multi-Criteria Decision-Making [7.394315090978424]
The framework fuses structured financial data, unstructured text data, and macroeconomic indicators to enhance predictive accuracy and robustness.<n>It incorporates a hybrid suite of models, including extreme gradient boosting (XGBoost), long short-term memory (LSTM) network, graph neural network (GNN)<n>The empirical results demonstrate high forecasting accuracy, with a net profit mean absolute percentage error (MAPE) of 3.03% and narrow 95% confidence intervals for key financial indicators.
arXiv Detail & Related papers (2025-12-11T04:19:26Z) - Rethinking Evaluation of Infrared Small Target Detection [105.59753496831739]
This paper introduces a hybrid-level metric incorporating pixel- and target-level performance, proposing a systematic error analysis method, and emphasizing the importance of cross-dataset evaluation.<n>An open-source toolkit has be released to facilitate standardized benchmarking.
arXiv Detail & Related papers (2025-09-21T02:45:07Z) - Unveiling Trust in Multimodal Large Language Models: Evaluation, Analysis, and Mitigation [51.19622266249408]
MultiTrust-X is a benchmark for evaluating, analyzing, and mitigating the trustworthiness issues of MLLMs.<n>Based on the taxonomy, MultiTrust-X includes 32 tasks and 28 curated datasets.<n>Our experiments reveal significant vulnerabilities in current models.
arXiv Detail & Related papers (2025-08-21T09:00:01Z) - Explainable Artificial Intelligence Credit Risk Assessment using Machine Learning [0.0]
This paper presents an AI-driven system for Credit Risk Assessment using three state-of-the-art ensemble machine learning models combined with Explainable AI (XAI) techniques.<n>The system leverages XGBoost, LightGBM, and Random Forest algorithms for predictive analysis of loan default risks.<n>LightGBM emerges as the most business-optimal model with the highest accuracy and best trade off between approval and default rates.
arXiv Detail & Related papers (2025-06-24T07:20:05Z) - Interpretable Credit Default Prediction with Ensemble Learning and SHAP [3.948008559977866]
This study focuses on the problem of credit default prediction, builds a modeling framework based on machine learning, and conducts comparative experiments on a variety of mainstream classification algorithms.<n>The results show that the ensemble learning method has obvious advantages in predictive performance, especially in dealing with complex nonlinear relationships between features and data imbalance problems.<n>The external credit score variable plays a dominant role in model decision making, which helps to improve the model's interpretability and practical application value.
arXiv Detail & Related papers (2025-05-27T07:23:22Z) - A Structured Reasoning Framework for Unbalanced Data Classification Using Probabilistic Models [1.6951945839990796]
The paper studies a Markov network model for unbalanced data, aiming to solve the problems of classification bias and insufficient minority class recognition ability.<n>The experimental results show that the Markov network performs well in indicators such as weighted accuracy, F1 score, and AUC-ROC.<n>Future research can focus on efficient model training, structural optimization, and deep learning integration in large-scale unbalanced data environments.
arXiv Detail & Related papers (2025-02-05T17:20:47Z) - Leveraging Convolutional Neural Network-Transformer Synergy for Predictive Modeling in Risk-Based Applications [5.914777314371152]
This paper proposes a deep learning model based on the combination of convolutional neural networks (CNN) and Transformer for credit user default prediction.<n>The results show that the CNN+Transformer model outperforms traditional machine learning models, such as random forests and XGBoost.<n>This study provides a new idea for credit default prediction and provides strong support for risk assessment and intelligent decision-making in the financial field.
arXiv Detail & Related papers (2024-12-24T07:07:14Z) - MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making.<n>We present a process-based benchmark MR-Ben that demands a meta-reasoning skill.<n>Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z) - Evaluating Mathematical Reasoning Beyond Accuracy [50.09931172314218]
We introduce ReasonEval, a new methodology for evaluating the quality of reasoning steps.<n>We show that ReasonEval consistently outperforms baseline methods in the meta-evaluation datasets.<n>We observe that ReasonEval can play a significant role in data selection.
arXiv Detail & Related papers (2024-04-08T17:18:04Z) - A machine learning workflow to address credit default prediction [0.44943951389724796]
Credit default prediction (CDP) plays a crucial role in assessing the creditworthiness of individuals and businesses.
We propose a workflow-based approach to improve CDP, which refers to the task of assessing the probability that a borrower will default on his or her credit obligations.
arXiv Detail & Related papers (2024-03-06T15:30:41Z) - InfoRM: Mitigating Reward Hacking in RLHF via Information-Theoretic Reward Modeling [66.3072381478251]
Reward hacking, also termed reward overoptimization, remains a critical challenge.
We propose a framework for reward modeling, namely InfoRM, by introducing a variational information bottleneck objective.
We show that InfoRM's overoptimization detection mechanism is not only effective but also robust across a broad range of datasets.
arXiv Detail & Related papers (2024-02-14T17:49:07Z) - When Demonstrations Meet Generative World Models: A Maximum Likelihood
Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task has applications in safety-sensitive applications such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z) - Uncertainty in Extreme Multi-label Classification [81.14232824864787]
eXtreme Multi-label Classification (XMC) is an essential task in the era of big data for web-scale machine learning applications.
In this paper, we aim to investigate general uncertainty quantification approaches for tree-based XMC models with a probabilistic ensemble-based framework.
In particular, we analyze label-level and instance-level uncertainty in XMC, and propose a general approximation framework based on beam search to efficiently estimate the uncertainty with a theoretical guarantee under long-tail XMC predictions.
arXiv Detail & Related papers (2022-10-18T20:54:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.