PSD2 Explainable AI Model for Credit Scoring
- URL: http://arxiv.org/abs/2011.10367v3
- Date: Fri, 6 Aug 2021 16:18:04 GMT
- Title: PSD2 Explainable AI Model for Credit Scoring
- Authors: Neus Llop Torrent (1 and 2), Giorgio Visani (2 and 3), Enrico Bagli
(2) ((1) Politecnico di Milano Graduate School of Business, (2) CRIF S.p.A.,
(3) University of Bologna School of Informatics and Engineering)
- Abstract summary: The aim of this project is to develop and test advanced analytical methods to improve the prediction accuracy of Credit Risk Models.
The project focuses on applying an explainable machine learning model to bank-related databases.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The aim of this project is to develop and test advanced analytical
methods to improve the prediction accuracy of Credit Risk Models while
preserving model interpretability. In particular, the project focuses on
applying an explainable machine learning model to bank-related databases. The
input data were obtained from open data sources. Among all the models tested,
CatBoost showed the highest performance, reaching a Gini index of 0.68 after
hyper-parameter tuning. The SHAP package is used to provide global and local
interpretations of the model predictions, offering a human-comprehensible view
of the decision-making algorithm. The 20 most important features are selected
using Shapley values to present a fully human-understandable model that
reveals how an individual's attributes relate to the model's prediction.
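To make the pipeline concrete, here is a minimal sketch of the workflow the abstract describes: train a CatBoost classifier, compute the Gini index from the ROC AUC (Gini = 2 * AUC - 1), and rank features by mean absolute SHAP value. This is not the authors' code: `load_credit_data` is a hypothetical placeholder for the open data source, and the hyper-parameters are illustrative rather than the tuned values.

```python
# Minimal sketch of the abstract's pipeline: CatBoost + Gini index + SHAP.
# load_credit_data() is a hypothetical loader; hyper-parameters are
# illustrative, not the paper's tuned values.
import numpy as np
import shap
from catboost import CatBoostClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_credit_data()  # hypothetical: features and binary default labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

model = CatBoostClassifier(iterations=500, depth=6, learning_rate=0.05,
                           verbose=False)
model.fit(X_train, y_train)

# The Gini index is a linear transform of the ROC AUC: Gini = 2 * AUC - 1.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Gini: {2 * auc - 1:.2f}")

# Global interpretation: rank features by mean absolute SHAP value and
# keep the 20 most important ones.
shap_values = shap.TreeExplainer(model).shap_values(X_test)
top20 = np.argsort(np.abs(shap_values).mean(axis=0))[::-1][:20]
print("Top-20 feature indices:", top20)
```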
Related papers
- Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching.
We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy.
Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z)
- Characterizing Disparity Between Edge Models and High-Accuracy Base Models for Vision Tasks [5.081175754775484]
We introduce XDELTA, a novel explainable AI tool that explains differences between a high-accuracy base model and a computationally efficient but lower-accuracy edge model.
We conduct a comprehensive evaluation to test XDELTA's ability to explain model discrepancies, using over 1.2 million images and 24 models, and assessing real-world deployments with six participants.
arXiv Detail & Related papers (2024-07-13T22:05:58Z)
- Data Shapley in One Training Run [88.59484417202454]
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts.
Existing approaches require re-training models on different data subsets, which is computationally intensive.
This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest.
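For context, the classical (retraining-based) Data Shapley value referred to above is the standard Shapley value applied to training points: the average marginal contribution of a point to a performance utility over all subsets of the remaining data. This is standard background, not the paper's new estimator:

```latex
% Classical Data Shapley value of training point z_i, with dataset D and
% utility U (e.g., validation accuracy of a model trained on the subset):
\phi(z_i) \;=\; \sum_{S \subseteq D \setminus \{z_i\}}
  \frac{|S|!\,\bigl(|D| - |S| - 1\bigr)!}{|D|!}
  \Bigl( U\bigl(S \cup \{z_i\}\bigr) - U(S) \Bigr)
```

Evaluating $U$ on exponentially many subsets is what makes naive computation require repeated retraining, which is the cost In-Run Data Shapley is designed to avoid.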
arXiv Detail & Related papers (2024-06-16T17:09:24Z)
- Decomposing and Editing Predictions by Modeling Model Computation [75.37535202884463]
We introduce a task called component modeling.
The goal of component modeling is to decompose an ML model's prediction in terms of its components, i.e., the architectural units (such as convolution filters or attention heads) that make up the model's computation.
We present COAR, a scalable algorithm for estimating component attributions.
arXiv Detail & Related papers (2024-04-17T16:28:08Z)
- OMNIINPUT: A Model-centric Evaluation Framework through Output Distribution [31.00645110294068]
We propose a model-centric evaluation framework, OmniInput, to evaluate the quality of an AI/ML model's predictions on all possible inputs.
We employ an efficient sampler to obtain representative inputs and the output distribution of the trained model.
Our experiments demonstrate that OmniInput enables a more fine-grained comparison between models.
arXiv Detail & Related papers (2023-12-06T04:53:12Z)
- A performance characteristic curve for model evaluation: the application in information diffusion prediction [3.8711489380602804]
We propose a metric based on information entropy to quantify the randomness in diffusion data, and then identify a scaling pattern between this randomness and the prediction accuracy of the model.
Data points from different sequence lengths, system sizes, and randomness levels all collapse onto a single curve, capturing a model's inherent capability of making correct predictions.
The validity of the curve is tested by three prediction models in the same family, reaching conclusions in line with existing studies.
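As background (the standard Shannon entropy, not a formula quoted from the paper), the randomness of a discrete distribution $p$ is:

```latex
% Shannon entropy of a discrete distribution p (standard definition):
H(p) \;=\; -\sum_{i} p_i \log p_i
```

Higher-entropy diffusion data is intrinsically harder to predict, which presumably forms the randomness axis of the characteristic curve.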
arXiv Detail & Related papers (2023-09-18T07:32:57Z)
- Precision-Recall Divergence Optimization for Generative Modeling with GANs and Normalizing Flows [54.050498411883495]
We develop a novel training method for generative models, such as Generative Adversarial Networks and Normalizing Flows.
We show that achieving a specified precision-recall trade-off corresponds to minimizing a unique $f$-divergence from a family we call the PR-divergences.
Our approach improves the performance of existing state-of-the-art models like BigGAN in terms of either precision or recall when tested on datasets such as ImageNet.
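For reference, the $f$-divergence family mentioned above has the standard form below; this is general background, and the specific generators $f$ that define the PR-divergences are given in the paper itself:

```latex
% f-divergence of P from Q, for a convex generator f with f(1) = 0:
D_f(P \,\|\, Q) \;=\;
  \mathbb{E}_{x \sim Q}\!\left[ f\!\left( \frac{dP}{dQ}(x) \right) \right]
```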
arXiv Detail & Related papers (2023-05-30T10:07:17Z)
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
By further elaborating the robustness metric, a model is judged to be robust if its performance is consistently accurate over entire cliques.
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
- Evaluating Representations with Readout Model Switching [18.475866691786695]
In this paper, we propose to use the Minimum Description Length (MDL) principle to devise an evaluation metric.
We design a hybrid discrete and continuous-valued model space for the readout models and employ a switching strategy to combine their predictions.
The proposed metric can be efficiently computed with an online method and we present results for pre-trained vision encoders of various architectures.
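As background, a common way to turn the MDL principle into an evaluation metric, consistent with the online method mentioned above, is the prequential code length: the cumulative log-loss incurred when each label is predicted from the data seen so far. This is standard MDL background, not a formula quoted from the paper:

```latex
% Prequential (online) description length of labels y_1..y_n:
L(y_{1:n} \mid x_{1:n}) \;=\;
  -\sum_{t=1}^{n} \log p\bigl(y_t \mid x_t,\; (x_s, y_s)_{s<t}\bigr)
```

A representation under which the readout models achieve a shorter code length is judged the better one.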
arXiv Detail & Related papers (2023-02-19T14:08:01Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
Often, fine-tuned models are readily available while their training data is not, which creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
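The simplest instance of parameter-space merging is a weighted average of model weights; the sketch below illustrates that baseline idea only, and is not the paper's actual (more refined) merging method:

```python
# Naive parameter-space merging: weighted average of the weights of
# same-architecture models. Illustrates the general idea only; the
# paper's merging method is more sophisticated than plain averaging.
from typing import Dict, List
import torch

def merge_state_dicts(state_dicts: List[Dict[str, torch.Tensor]],
                      weights: List[float]) -> Dict[str, torch.Tensor]:
    """Weighted average of parameters from models sharing one architecture."""
    total = sum(weights)
    return {
        name: sum(w * sd[name] for w, sd in zip(weights, state_dicts)) / total
        for name in state_dicts[0]
    }

# Usage: merge two fine-tuned checkpoints of the same base model.
# merged = merge_state_dicts(
#     [model_a.state_dict(), model_b.state_dict()], [0.5, 0.5])
```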
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Energy Predictive Models for Convolutional Neural Networks on Mobile Platforms [0.0]
Energy use is a key concern when deploying deep learning models on mobile devices.
We build layer-type predictive models for the fully-connected and pooling layers using 12 representative Convolutional Neural Networks (ConvNets) on the Jetson TX1 and the Snapdragon 820.
We obtain an accuracy between 76% and 85%, with a model complexity of 1, for the overall energy prediction of the test ConvNets across different hardware-software combinations.
arXiv Detail & Related papers (2020-04-10T17:35:40Z)
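A layer-type predictive model of this kind can be as simple as a regression from layer hyper-parameters to measured energy, fitted separately per layer type. The sketch below is a hedged illustration, not the paper's implementation; the feature choice and the numbers are placeholders:

```python
# Hedged sketch of a layer-type energy predictor: a simple regression from
# layer hyper-parameters to measured energy for one layer type. Features
# and measurements below are placeholders, not the paper's data.
import numpy as np
from sklearn.linear_model import LinearRegression

# Fully-connected layers: features = (input units, output units), target = mJ.
fc_features = np.array([[1024, 512], [512, 256], [4096, 1000]])  # placeholder
fc_energy = np.array([3.1, 1.2, 9.8])                            # placeholder

fc_model = LinearRegression().fit(fc_features, fc_energy)

# Predict the energy draw of an unseen fully-connected layer.
print(fc_model.predict(np.array([[2048, 512]])))
```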