How Costly is Noise? Data and Disparities in Consumer Credit
- URL: http://arxiv.org/abs/2105.07554v1
- Date: Mon, 17 May 2021 00:42:26 GMT
- Title: How Costly is Noise? Data and Disparities in Consumer Credit
- Authors: Laura Blattner and Scott Nelson
- Abstract summary: We show that credit scores are noisier indicators of default risk for historically under-served groups.
We find that equalizing the precision of credit scores can reduce disparities in approval rates and in credit misallocation for disadvantaged groups by approximately half.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We show that lenders face more uncertainty when assessing default risk of
historically under-served groups in US credit markets and that this information
disparity is a quantitatively important driver of inefficient and unequal
credit market outcomes. We first document that widely used credit scores are
statistically noisier indicators of default risk for historically under-served
groups. This noise emerges primarily through the explanatory power of the
underlying credit report data (e.g., thin credit files), not through issues
with model fit (e.g., the inability to include protected class in the scoring
model). Estimating a structural model of lending with heterogeneity in
information, we quantify the gains from addressing these information
disparities for the US mortgage market. We find that equalizing the precision
of credit scores can reduce disparities in approval rates and in credit
misallocation for disadvantaged groups by approximately half.
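The link between score precision and approval rates can be illustrated with a toy signal-extraction calculation. This is a minimal sketch under strong assumptions, not the structural model estimated in the paper: creditworthiness `theta` is assumed standard normal, the score is `theta` plus independent normal noise, and a hypothetical lender approves whenever the posterior mean of `theta` exceeds a positive `cutoff`. The function names, the `cutoff` value, and the noise levels are all illustrative.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def approval_rate(cutoff: float, noise_sd: float) -> float:
    """Share of applicants approved in a toy normal-normal model.

    Assumptions (illustrative, not taken from the paper):
      theta ~ N(0, 1)                 latent creditworthiness
      s = theta + eps, eps ~ N(0, noise_sd^2)   observed score
      approve iff E[theta | s] > cutoff
    With these assumptions E[theta | s] = s / (1 + noise_sd^2), which is
    normally distributed with mean 0 and sd 1 / sqrt(1 + noise_sd^2).
    """
    posterior_mean_sd = 1.0 / math.sqrt(1.0 + noise_sd ** 2)
    return 1.0 - normal_cdf(cutoff / posterior_mean_sd)


if __name__ == "__main__":
    cutoff = 0.5  # hypothetical approval threshold
    for label, noise_sd in [("precise score", 0.5), ("noisy score", 1.5)]:
        rate = approval_rate(cutoff, noise_sd)
        print(f"{label}: approval rate = {rate:.3f}")
```

In this sketch a noisier score pulls the lender's posterior toward the population mean, so fewer applicants clear a cutoff that sits above that mean, even though both groups have the same distribution of true creditworthiness.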
Related papers
- Debiasing Alternative Data for Credit Underwriting Using Causal Inference [0.0]
Alternative data provides valuable insights for lenders to evaluate a borrower's creditworthiness.
But some forms of alternative data have historically been excluded from credit underwriting because they could act as illegal proxies for a protected class.
We propose a method for applying causal inference to a supervised machine learning model to debias alternative data so that it might be used for credit underwriting.
arXiv Detail & Related papers (2024-10-29T12:54:55Z)
- Credit Scores: Performance and Equity [0.0]
We benchmark a widely used credit score against a machine learning model of consumer default.
We find significant misclassification of borrowers, especially those with low scores.
Our model improves predictive accuracy for young, low-income, and minority groups.
arXiv Detail & Related papers (2024-08-30T23:36:02Z)
- The Impact of Differential Feature Under-reporting on Algorithmic Fairness [86.275300739926]
We present an analytically tractable model of differential feature under-reporting.
We then use it to characterize the impact of this kind of data bias on algorithmic fairness.
Our results show that, in real-world data settings, under-reporting typically leads to increasing disparities.
arXiv Detail & Related papers (2024-01-16T19:16:22Z)
- Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models [53.620827459684094]
Large Language Models (LLMs) have great potential for credit scoring tasks, with strong generalization ability across multiple tasks.
We propose the first open-source comprehensive framework for exploring LLMs for credit scoring.
We then propose the first Credit and Risk Assessment Large Language Model (CALM) by instruction tuning, tailored to the nuanced demands of various financial risk assessment tasks.
arXiv Detail & Related papers (2023-10-01T03:50:34Z)
- On the dynamics of credit history and social interaction features, and their impact on creditworthiness assessment performance [3.6748639131154315]
This study aims to understand the dynamics of creditworthiness assessment performance and how they are influenced by credit history, repayment behavior, and social network features.
Our research shows that borrowers' history increases performance at a decreasing rate during the first six months and then stabilizes.
The most notable effect of social network features on performance occurs at loan application.
arXiv Detail & Related papers (2022-04-13T00:42:27Z)
- Feature-Level Fusion of Super-App and Telecommunication Alternative Data Sources for Credit Card Fraud Detection [106.33204064461802]
We review the effectiveness of a feature-level fusion of super-app customer information, mobile phone line data, and traditional credit risk variables for the early detection of identity theft credit card fraud.
We evaluate our approach over approximately 90,000 users from a credit lender's digital platform database.
arXiv Detail & Related papers (2021-11-05T19:10:35Z)
- Explanations of Machine Learning predictions: a mandatory step for its application to Operational Processes [61.20223338508952]
Credit Risk Modelling plays a paramount role.
Recent machine and deep learning techniques have been applied to the task.
We suggest using the LIME technique to tackle the explainability problem in this field.
arXiv Detail & Related papers (2020-12-30T10:27:59Z)
- A Novel Classification Approach for Credit Scoring based on Gaussian Mixture Models [0.0]
This paper introduces a new method for credit scoring based on Gaussian Mixture Models.
Our algorithm classifies consumers into groups which are labeled as positive or negative.
We apply our model to real-world databases from Australia, Japan, and Germany.
arXiv Detail & Related papers (2020-10-26T07:34:27Z)
- PCAL: A Privacy-preserving Intelligent Credit Risk Modeling Framework Based on Adversarial Learning [111.19576084222345]
This paper proposes a framework for Privacy-preserving Credit risk modeling based on Adversarial Learning (PCAL).
PCAL aims to mask the private information inside the original dataset, while maintaining the important utility information for the target prediction task performance.
Results indicate that PCAL can learn an effective, privacy-free representation from user data, providing a solid foundation towards privacy-preserving machine learning for credit risk analysis.
arXiv Detail & Related papers (2020-10-06T07:04:59Z)
- Super-App Behavioral Patterns in Credit Risk Models: Financial, Statistical and Regulatory Implications [110.54266632357673]
We present the impact of alternative data that originates from an app-based marketplace, in contrast to traditional bureau data, upon credit scoring models.
Our results, validated across two countries, show that these new sources of data are particularly useful for predicting financial behavior in low-wealth and young individuals.
arXiv Detail & Related papers (2020-05-09T01:32:03Z)
- Predicting Bank Loan Default with Extreme Gradient Boosting [0.0]
We use an Extreme Gradient Boosting algorithm called XGBoost for loan default prediction.
The prediction is based on loan data from a leading bank, taking into account data from both the loan application and the applicant's demographics.
arXiv Detail & Related papers (2020-01-18T18:52:10Z)