The Impact of Pseudo-Science in Financial Loans Risk Prediction
- URL: http://arxiv.org/abs/2507.16182v2
- Date: Thu, 24 Jul 2025 06:01:17 GMT
- Title: The Impact of Pseudo-Science in Financial Loans Risk Prediction
- Authors: Bruno Scarone, Ricardo Baeza-Yates
- Abstract summary: We study the societal impact of pseudo-scientific assumptions for predicting the behavior of people in a straightforward application of machine learning to risk prediction in financial lending. We analyze the models in terms of their accuracy and social cost, showing that the socially optimal model may not imply a significant accuracy loss for this downstream task.
- Score: 5.764237203972864
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We study the societal impact of pseudo-scientific assumptions for predicting the behavior of people in a straightforward application of machine learning to risk prediction in financial lending. This use case also exemplifies the impact of survival bias in loan return prediction. We analyze the models in terms of their accuracy and social cost, showing that the socially optimal model may not imply a significant accuracy loss for this downstream task. Our results are verified for commonly used learning methods and datasets. Our findings also show that when training models that suffer from survival bias, there is a natural dynamic in which accuracy slightly deteriorates while recall and precision improve over time. This creates an illusion, leading the observer to believe that the system is getting better, when in fact the model suffers from increasing unfairness and survival bias.
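The survivorship dynamic described in the abstract is easy to reproduce in a toy setting. The sketch below (our illustration under assumed features and thresholds, not the authors' code) retrains a lender's model only on the loans it previously approved, so that metrics computed on the observed survivors drift away from accuracy on the full applicant population:

```python
# Toy simulation of survival bias in loan-return prediction:
# each round, the model is retrained only on previously approved loans.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def applicants(n=5000):
    X = rng.normal(size=(n, 2))                       # applicant features (assumed)
    p = 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))  # true repayment probability
    return X, rng.binomial(1, p)                      # y = 1 means the loan is repaid

X, y = applicants()
model = LogisticRegression().fit(X, y)                # round 0: unbiased training data

for t in range(1, 6):
    X, y = applicants()
    approved = model.predict(X) == 1                  # only approved loans are ever observed
    obs_precision = y[approved].mean()                # repayment rate among approved loans
    full_accuracy = (model.predict(X) == y).mean()    # unobservable in practice
    print(f"round {t}: observed precision={obs_precision:.3f}, "
          f"approval rate={approved.mean():.3f}, full-pop accuracy={full_accuracy:.3f}")
    if len(np.unique(y[approved])) < 2:               # survivors became single-class
        break
    model = LogisticRegression().fit(X[approved], y[approved])  # retrain on survivors only
```

Because rejected applicants never generate outcome labels, each retraining round sees an increasingly skewed sample, which is the mechanism behind the illusion the abstract describes.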
Related papers
- Learning-Augmented Robust Algorithmic Recourse [7.217269034256654]
Algorithmic recourse provides suggestions of minimum-cost improvements to achieve a desirable outcome in the future.
Machine learning models often get updated over time and this can cause a recourse to become invalid.
We propose a novel algorithm for this problem, study the robustness-consistency trade-off, and analyze how prediction accuracy affects performance.
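For intuition, minimum-cost recourse has a closed form for a linear scorer. The hedged sketch below (our own toy example; the margin parameter is an assumed stand-in for the paper's robustness machinery against model updates) computes the cheapest L2 change that flips a rejection:

```python
import numpy as np

def linear_recourse(x, w, b, margin=0.0):
    """Smallest L2-cost action moving x so that w @ x' + b >= margin."""
    score = w @ x + b
    if score >= margin:
        return np.zeros_like(x)               # already on the favorable side
    return (margin - score) / (w @ w) * w     # closed form: move straight along w

w, b = np.array([1.0, -2.0]), -0.5            # assumed linear credit scorer
x = np.array([0.0, 1.0])                      # a currently rejected applicant
delta = linear_recourse(x, w, b, margin=0.1)  # small margin pads against model drift
print("suggested change:", delta, "| new score:", w @ (x + delta) + b)
```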
arXiv Detail & Related papers (2024-10-02T14:15:32Z)
- Performative Prediction on Games and Mechanism Design [69.7933059664256]
We study a collective risk dilemma where agents decide whether to trust predictions based on past accuracy. As predictions shape collective outcomes, social welfare arises naturally as a metric of concern. We show how to achieve better trade-offs and use them for mechanism design.
arXiv Detail & Related papers (2024-08-09T16:03:44Z)
- Imputation for prediction: beware of diminishing returns [12.424671213282256]
Missing values are prevalent across various fields, posing challenges for training and deploying predictive models. Recent theoretical and empirical studies indicate that simple constant imputation can be consistent and competitive. This study aims at clarifying if and when investing in advanced imputation methods yields significantly better predictions.
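This question is straightforward to probe on one's own data. A minimal sketch using standard scikit-learn components compares constant imputation against the more expensive iterative imputer on the downstream prediction task:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import SimpleImputer, IterativeImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X[rng.random(X.shape) < 0.2] = np.nan         # 20% of values missing at random

for name, imputer in [("constant", SimpleImputer(strategy="constant", fill_value=0.0)),
                      ("iterative", IterativeImputer(max_iter=10, random_state=0))]:
    pipe = make_pipeline(imputer, Ridge())
    score = cross_val_score(pipe, X, y, cv=5).mean()
    print(f"{name:9s} imputation: R^2 = {score:.3f}")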
arXiv Detail & Related papers (2024-07-29T09:01:06Z)
- Do We Really Even Need Data? [2.3749120526936465]
Researchers increasingly use predictions from pre-trained algorithms as outcome variables.
Standard tools for inference can misrepresent the association between independent variables and the outcome of interest when the true, unobserved outcome is replaced by a predicted value.
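A minimal simulation (ours, with assumed coefficients) makes the distortion concrete: regressing on a machine-predicted outcome recovers the predictor's slope, not the true effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)              # true effect of x on y is 2.0

# Suppose y is unobserved and replaced by an imperfect pre-trained
# predictor (assumed form; any systematic error behaves similarly).
y_hat = 1.2 * x + 0.5 * rng.normal(size=n)

slope = lambda outcome: np.polyfit(x, outcome, 1)[0]
print("OLS slope with true y:     ", round(slope(y), 3))      # close to 2.0
print("OLS slope with predicted y:", round(slope(y_hat), 3))  # distorted (about 1.2)
```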
arXiv Detail & Related papers (2024-01-14T23:19:21Z)
- Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against subgroups described by protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is the prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data without given causal models by proposing a novel framework CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z)
- Ecosystem-level Analysis of Deployed Machine Learning Reveals Homogeneous Outcomes [72.13373216644021]
We study the societal impact of machine learning by considering the collection of models that are deployed in a given context.
We find deployed machine learning is prone to systemic failure, meaning some users are exclusively misclassified by all models available.
These examples demonstrate ecosystem-level analysis has unique strengths for characterizing the societal impact of machine learning.
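The headline quantity is simple to compute once per-model correctness indicators are available for a shared user population; a minimal sketch with simulated indicators:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_models = 1000, 3
# correct[i, j] = True if model j classifies user i correctly (simulated here;
# in practice these come from evaluating each deployed model on shared users).
correct = rng.random((n_users, n_models)) < 0.9

failed_by_all = ~correct.any(axis=1)          # users misclassified by every model
print("per-model error rates:", (~correct).mean(axis=0).round(3))
print(f"systemic failure rate: {failed_by_all.mean():.3%}")
```

Under independent errors this rate would be the product of the per-model error rates; the paper's point is that deployed models fail far more homogeneously than independence would predict.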
arXiv Detail & Related papers (2023-07-12T01:11:52Z)
- Learning to Predict Trustworthiness with Steep Slope Loss [69.40817968905495]
We study the problem of predicting trustworthiness on real-world large-scale datasets.
We observe that the trustworthiness predictors trained with prior-art loss functions are prone to view both correct and incorrect predictions as trustworthy.
We propose a novel steep slope loss to separate the features of correct predictions from those of incorrect predictions using two slide-like curves that oppose each other.
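As a rough schematic of the idea (our reading, not necessarily the paper's exact formulation), two steep opposing curves can penalize low scores for correct predictions and high scores for incorrect ones, widening the gap between the two groups:

```python
import numpy as np

def steep_slope_loss(score, is_correct, k=10.0, margin=0.5):
    """score: trustworthiness score; is_correct: 1 if the prediction was right."""
    softplus = lambda z: np.logaddexp(0.0, z)
    up = softplus(k * (margin - score))       # steep penalty when a correct
                                              # prediction is scored low ...
    down = softplus(k * (score + margin))     # ... and when an incorrect one
                                              # is scored high.
    return np.where(is_correct == 1, up, down).mean()

scores = np.array([2.0, -1.5, 0.1, -2.0])     # assumed trustworthiness scores
labels = np.array([1, 0, 1, 0])               # 1 = classifier was correct
print("loss:", round(steep_slope_loss(scores, labels), 4))
```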
arXiv Detail & Related papers (2021-09-30T19:19:09Z)
- Learning Uncertainty with Artificial Neural Networks for Improved Remaining Time Prediction of Business Processes [0.15229257192293202]
This paper is the first to apply these techniques to predictive process monitoring.
We found that they contribute to more accurate predictions and are fast to compute.
This enables many interesting applications, allows earlier adoption of prediction systems trained on smaller datasets, and fosters better cooperation with humans.
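One standard way to obtain such uncertainty estimates from a neural network is Monte-Carlo dropout; the sketch below (mechanics only, with untrained stand-in weights) keeps dropout active at inference and reads the spread of repeated stochastic forward passes:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(8, 16)), rng.normal(size=(16, 1))  # stand-in for trained weights

def forward(x, drop=0.5):
    h = np.maximum(0.0, x @ W1)                       # ReLU hidden layer
    mask = rng.random(h.shape) >= drop                # dropout stays ON at inference
    h = h * mask / (1.0 - drop)
    return (h @ W2).ravel()

x = rng.normal(size=(1, 8))                           # one running process instance
samples = np.array([forward(x) for _ in range(200)])  # 200 stochastic passes
print(f"remaining-time estimate: {samples.mean():.2f} ± {samples.std():.2f}")
```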
arXiv Detail & Related papers (2021-05-12T10:18:57Z)
- When Does Uncertainty Matter?: Understanding the Impact of Predictive Uncertainty in ML Assisted Decision Making [68.19284302320146]
We carry out user studies to assess how people with differing levels of expertise respond to different types of predictive uncertainty.
We found that showing posterior predictive distributions led to smaller disagreements with the ML model's predictions.
This suggests that posterior predictive distributions can serve as useful decision aids, though they should be used with caution, taking into account the type of distribution and the expertise of the human.
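In the simplest case, "showing a posterior predictive distribution" amounts to presenting the full distribution over outcomes rather than a point estimate; a minimal Beta-Bernoulli sketch (our example, with assumed counts):

```python
import numpy as np

rng = np.random.default_rng(0)
successes, failures = 14, 6                   # assumed outcomes observed so far
a, b = 1 + successes, 1 + failures            # Beta(1, 1) prior -> Beta posterior

theta = rng.beta(a, b, size=10_000)           # posterior over the success rate
y_next = rng.binomial(1, theta)               # posterior predictive for the next outcome

print(f"point estimate:       {successes / (successes + failures):.2f}")
print(f"posterior predictive: P(y=1) = {y_next.mean():.2f}, "
      f"95% interval on rate = [{np.quantile(theta, 0.025):.2f}, "
      f"{np.quantile(theta, 0.975):.2f}]")
```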
arXiv Detail & Related papers (2020-11-12T02:23:53Z)
- Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions.
We make robust and efficient counterfactual predictions for both individual and average treatment effects.
The algorithm shows competitive performance with the state-of-the-art on real world and synthetic data.
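The doubly-robust idea underlying methods like this one is commonly implemented as the AIPW estimator, which remains consistent if either the outcome model or the propensity model is correct; a hedged sketch on simulated data with a known treatment effect:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))
e = 1 / (1 + np.exp(-X[:, 0]))                # true propensity of treatment
T = rng.binomial(1, e)
y = X @ np.array([1.0, 0.5, -0.5]) + 2.0 * T + rng.normal(size=n)  # true ATE = 2.0

prop = LogisticRegression().fit(X, T).predict_proba(X)[:, 1]       # propensity model
mu1 = LinearRegression().fit(X[T == 1], y[T == 1]).predict(X)      # outcome model, treated
mu0 = LinearRegression().fit(X[T == 0], y[T == 0]).predict(X)      # outcome model, control

# AIPW: outcome-model difference plus inverse-propensity-weighted residuals.
aipw = (mu1 - mu0
        + T * (y - mu1) / prop
        - (1 - T) * (y - mu0) / (1 - prop))
print(f"AIPW estimate of the average treatment effect: {aipw.mean():.3f}")
```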
arXiv Detail & Related papers (2020-10-15T16:39:26Z)
- Counterfactual Predictions under Runtime Confounding [74.90756694584839]
We study the counterfactual prediction task in the setting where all relevant factors are captured in the historical data, but some of them cannot be used by the prediction model at runtime.
We propose a doubly-robust procedure for learning counterfactual prediction models in this setting.
arXiv Detail & Related papers (2020-06-30T15:49:05Z)
- From Predictions to Decisions: Using Lookahead Regularization [28.709041337894107]
We introduce lookahead regularization, which, by anticipating user actions, encourages predictive models to also induce actions that improve outcomes.
We report the results of experiments on real and synthetic data that show the effectiveness of this approach.
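A heavily simplified rendition of the idea (ours; where the paper learns an outcome estimate, we use an oracle for clarity) scores a candidate model both on predictive fit and on the true outcomes induced when users act to improve their predicted value:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)          # correlated with x1 but non-causal
X = np.column_stack([x1, x2])
y = x1 + 0.1 * rng.normal(size=200)           # true outcome depends on x1 only

def lookahead_loss(w, lam=1.0, step=0.5):
    pred_loss = np.mean((X @ w - y) ** 2)
    X_moved = X + step * w / (np.linalg.norm(w) + 1e-9)  # users act to raise their score
    improvement = (X_moved[:, 0] - X[:, 0]).mean()       # oracle outcome gain; the paper
                                                         # would use a learned estimate
    return pred_loss - lam * improvement

for w in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):   # causal vs. spurious weights
    print(w, "-> lookahead loss:", round(lookahead_loss(w), 3))
```

Here the two weight vectors predict almost equally well, but only the causal one induces actions that actually improve the outcome, so the lookahead term separates them.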
arXiv Detail & Related papers (2020-06-20T19:23:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.