Fighting Sampling Bias: A Framework for Training and Evaluating Credit Scoring Models
- URL: http://arxiv.org/abs/2407.13009v1
- Date: Wed, 17 Jul 2024 20:59:54 GMT
- Title: Fighting Sampling Bias: A Framework for Training and Evaluating Credit Scoring Models
- Authors: Nikita Kozodoi, Stefan Lessmann, Morteza Alamgir, Luis Moreira-Matias, Konstantinos Papakonstantinou
- Abstract summary: This paper addresses the adverse effect of sampling bias on model training and evaluation.
We propose bias-aware self-learning, a reject inference framework for scorecard training, and a Bayesian framework for scorecard evaluation.
Our results suggest a profit improvement of about eight percent when using Bayesian evaluation to decide on acceptance rates.
- Score: 2.918530881730374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scoring models support decision-making in financial institutions. Their estimation and evaluation are based on the data of previously accepted applicants with known repayment behavior. This creates sampling bias: the available labeled data offers a partial picture of the distribution of candidate borrowers, which the model is supposed to score. The paper addresses the adverse effect of sampling bias on model training and evaluation. To improve scorecard training, we propose bias-aware self-learning - a reject inference framework that augments the biased training data by inferring labels for selected rejected applications. For scorecard evaluation, we propose a Bayesian framework that extends standard accuracy measures to the biased setting and provides a reliable estimate of future scorecard performance. Extensive experiments on synthetic and real-world data confirm the superiority of our propositions over various benchmarks in predictive performance and profitability. By sensitivity analysis, we also identify boundary conditions affecting their performance. Notably, we leverage real-world data from a randomized controlled trial to assess the novel methodologies on holdout data that represent the true borrower population. Our findings confirm that reject inference is a difficult problem with modest potential to improve scorecard performance. Addressing sampling bias during scorecard evaluation is a much more promising route to improve scoring practices. For example, our results suggest a profit improvement of about eight percent, when using Bayesian evaluation to decide on acceptance rates.
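To make the training-side proposal concrete, the following is a minimal sketch of reject inference via confidence-based pseudo-labeling, in the spirit of bias-aware self-learning. The function name, the logistic-regression scorecard, the selection rule (pseudo-labeling only high-confidence "bad" rejects), and all thresholds are illustrative assumptions, not the authors' exact algorithm.

```python
# Illustrative sketch only: a self-learning loop that augments the
# accepted-applicant training data with pseudo-labeled rejects.
# Selection rule and thresholds are assumptions, not the paper's exact procedure.
import numpy as np
from sklearn.linear_model import LogisticRegression


def reject_inference_self_learning(X_acc, y_acc, X_rej,
                                   n_rounds=3, frac=0.1, threshold=0.9):
    """Iteratively augment the accepted-only training data with pseudo-labeled rejects."""
    X_train, y_train = np.asarray(X_acc), np.asarray(y_acc)
    X_rej = np.asarray(X_rej)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # scorecard on accepts only
    for _ in range(n_rounds):
        if len(X_rej) == 0:
            break
        p_bad = model.predict_proba(X_rej)[:, 1]              # estimated P(default) for each reject
        confident = np.where(p_bad >= threshold)[0]           # high-confidence "bad" rejects only
        take = confident[np.argsort(-p_bad[confident])][:max(1, int(frac * len(X_rej)))]
        if take.size == 0:
            break
        X_train = np.vstack([X_train, X_rej[take]])           # add pseudo-labeled rejects
        y_train = np.concatenate([y_train, np.ones(take.size, dtype=y_train.dtype)])
        X_rej = np.delete(X_rej, take, axis=0)
        model = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # retrain on augmented data
    return model
```

The paper's bias-aware variant additionally controls which rejects enter the training data and validates the resulting scorecard against data representing the true borrower population; the sketch above conveys only the basic self-learning loop.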
Related papers
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - Learning with Imbalanced Noisy Data by Preventing Bias in Sample Selection [82.43311784594384]
Real-world datasets contain not only noisy labels but also class imbalance.
We propose a simple yet effective method to address noisy labels in imbalanced datasets.
arXiv Detail & Related papers (2024-02-17T10:34:53Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - Leveraging Uncertainty Estimates To Improve Classifier Performance [4.4951754159063295]
Binary classification involves predicting the label of an instance based on whether the model score for the positive class exceeds a threshold chosen according to the application requirements.
However, model scores are often not aligned with the true positivity rate.
This is especially true when the training involves a differential sampling across classes or there is distributional drift between train and test settings.
arXiv Detail & Related papers (2023-11-20T12:40:25Z) - Unbiased Decisions Reduce Regret: Adversarial Domain Adaptation for the Bank Loan Problem [21.43618923706602]
We focus on a class of problems that share a common feature: the true label is only observed when a data point is assigned a positive label by the principal.
We introduce adversarial optimism (AdOpt) to directly address bias in the training set using adversarial domain adaptation.
arXiv Detail & Related papers (2023-08-15T21:35:44Z) - Exploring validation metrics for offline model-based optimisation with diffusion models [50.404829846182764]
In model-based optimisation (MBO) we are interested in using machine learning to design candidates that maximise some measure of reward with respect to a black box function called the (ground truth) oracle.
While an approximation to the ground truth oracle can be trained and used in place of it during model validation to measure the mean reward over generated candidates, the evaluation is approximate and vulnerable to adversarial examples.
This is encapsulated under our proposed evaluation framework which is also designed to measure extrapolation.
arXiv Detail & Related papers (2022-11-19T16:57:37Z) - Augmentation by Counterfactual Explanation -- Fixing an Overconfident Classifier [11.233334009240947]
A highly accurate but overconfident model is ill-suited for deployment in critical applications such as healthcare and autonomous driving.
This paper proposes an application of counterfactual explanations in fixing an overconfident classifier.
arXiv Detail & Related papers (2022-10-21T18:53:16Z) - Rethinking InfoNCE: How Many Negative Samples Do You Need? [54.146208195806636]
We study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework.
We estimate the optimal negative sampling ratio using the $K$ value that maximizes the training effectiveness function.
arXiv Detail & Related papers (2021-05-27T08:38:29Z) - Robust Fairness-aware Learning Under Sample Selection Bias [17.09665420515772]
We propose a framework for robust and fair learning under sample selection bias.
We develop two algorithms to handle sample selection bias when test data is both available and unavailable.
arXiv Detail & Related papers (2021-05-24T23:23:36Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking
Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing ranking fairness and algorithm utility in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z) - Active Bayesian Assessment for Black-Box Classifiers [20.668691047355072]
We introduce an active Bayesian approach for assessment of classifier performance to satisfy the desiderata of both reliability and label-efficiency.
We first develop inference strategies to quantify uncertainty for common assessment metrics such as accuracy, misclassification cost, and calibration error.
We then propose a general framework for active Bayesian assessment using inferred uncertainty to guide efficient selection of instances for labeling.
arXiv Detail & Related papers (2020-02-16T08:08:42Z)
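To illustrate the Bayesian assessment idea in the last entry above, which also relates to the evaluation theme of the main paper, here is a minimal sketch of a Beta-Bernoulli posterior over a classifier's accuracy. The uniform prior and the function name are illustrative assumptions, not the cited paper's full framework.

```python
# Illustrative sketch: Beta-Bernoulli posterior over classifier accuracy.
# The uniform Beta(1, 1) prior is an assumption chosen for demonstration.
from scipy.stats import beta


def accuracy_posterior(n_correct, n_total, prior_a=1.0, prior_b=1.0):
    """Return the Beta posterior over accuracy given labeled assessment results."""
    return beta(prior_a + n_correct, prior_b + (n_total - n_correct))


posterior = accuracy_posterior(n_correct=88, n_total=100)
print("posterior mean accuracy:", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))
```

The cited work goes further, deriving such posteriors for misclassification cost and calibration error as well, and using the inferred uncertainty to select which instances to label next; the snippet shows only the basic uncertainty-quantification step.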
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.