Related papers: Predicting and Explaining Customer Data Sharing in the Open Banking

URL: http://arxiv.org/abs/2507.01987v1
Date: Sat, 28 Jun 2025 01:24:59 GMT
Title: Predicting and Explaining Customer Data Sharing in the Open Banking
Authors: João B. G. de Brito, Rodrigo Heldt, Cleo S. Silveira, Matthias Bogaert, Guilherme B. Bucco, Fernando B. Luce, João L. Becker, Filipe J. Zabala, Michel J. Anzanello
Abstract summary: This study introduces a framework to predict customers' propensity to share data via Open Banking and interprets this behavior through Explanatory Model Analysis (EMA). Using data from a large Brazilian financial institution with approximately 3.2 million customers, a hybrid data balancing strategy was employed to address the infrequency of data sharing. The XGBoost models trained on the balanced data accurately predicted customer data sharing, achieving 91.39% accuracy for inflow and 91.53% for outflow.
Score: 34.337412054122076
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The emergence of Open Banking represents a significant shift in financial data management, influencing financial institutions' market dynamics and marketing strategies. This increased competition creates opportunities and challenges, as institutions manage data inflow to improve products and services while mitigating data outflow that could aid competitors. This study introduces a framework to predict customers' propensity to share data via Open Banking and interprets this behavior through Explanatory Model Analysis (EMA). Using data from a large Brazilian financial institution with approximately 3.2 million customers, a hybrid data balancing strategy incorporating ADASYN and NEARMISS techniques was employed to address the infrequency of data sharing and enhance the training of XGBoost models. These models accurately predicted customer data sharing, achieving 91.39% accuracy for inflow and 91.53% for outflow. The EMA phase combined the Shapley Additive Explanations (SHAP) method with the Classification and Regression Tree (CART) technique, revealing the most influential features on customer decisions. Key features included the number of transactions and purchases in mobile channels, interactions within these channels, and credit-related features, particularly credit card usage across the national banking system. These results highlight the critical role of mobile engagement and credit in driving customer data-sharing behaviors, providing financial institutions with strategic insights to enhance competitiveness and innovation in the Open Banking environment.
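
The hybrid balancing idea named in the abstract (oversample the rare class, undersample the common one) can be illustrated with a small sketch. This is not the authors' pipeline: it is a simplified, pure-Python approximation of the ADASYN and NearMiss-1 ideas, and the function names, the toy 2-D data, and every parameter value are assumptions made for illustration. It also assumes at least two minority points. In practice one would more likely use the `ADASYN` and `NearMiss` classes from the imbalanced-learn library before training a classifier.

```python
import math
import random


def nearest(point, pool, k):
    """Return the k points in pool closest to point (Euclidean distance)."""
    return sorted(pool, key=lambda q: math.dist(point, q))[:k]


def adasyn_like_oversample(minority, majority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points, favouring hard regions.

    Simplified ADASYN idea: a minority point is picked as a seed more often
    when more of its k nearest neighbours belong to the majority class.
    """
    rng = random.Random(seed)
    labelled = [(p, 1) for p in minority] + [(p, 0) for p in majority]
    weights = []
    for p in minority:
        others = [(q, y) for q, y in labelled if q is not p]
        knn = sorted(others, key=lambda item: math.dist(p, item[0]))[:k]
        weights.append(sum(1 for _, y in knn if y == 0) / k)
    if sum(weights) == 0:  # no hard points: fall back to uniform sampling
        weights = [1.0] * len(minority)
    synthetic = []
    for _ in range(n_new):
        base = rng.choices(minority, weights=weights, k=1)[0]
        candidates = [q for q in minority if q is not base]
        mate = rng.choice(nearest(base, candidates, k))
        t = rng.random()  # interpolate between the seed and a minority neighbour
        synthetic.append(tuple(b + t * (m - b) for b, m in zip(base, mate)))
    return synthetic


def nearmiss1_undersample(majority, minority, n_keep, k=3):
    """Keep the n_keep majority points with the smallest mean distance
    to their k nearest minority points (the NearMiss-1 rule)."""
    def avg_dist(p):
        d = sorted(math.dist(p, q) for q in minority)[:k]
        return sum(d) / len(d)
    return sorted(majority, key=avg_dist)[:n_keep]


# Hypothetical toy data: 200 "non-sharing" customers, 10 "sharing" customers.
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
minority = [(data_rng.gauss(2, 1), data_rng.gauss(2, 1)) for _ in range(10)]

synthetic = adasyn_like_oversample(minority, majority, n_new=40)
kept = nearmiss1_undersample(majority, minority, n_keep=50)

# Class sizes after balancing: minority (real + synthetic) vs. majority kept.
print(len(minority) + len(synthetic), len(kept))  # prints "50 50"
```

Combining the two steps avoids both extremes: oversampling alone would flood the training set with synthetic points, while undersampling alone would discard most of the majority class.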

Related papers

Bayesian Regression for Predicting Subscription to Bank Term Deposits in Direct Marketing Campaigns [0.0]
The purpose of this research is to examine the efficacy of logit and probit models in predicting term deposit subscriptions. The target variable was balanced, considering the inherent imbalance in the dataset. The logit model performed better than the probit model in handling this classification problem.
arXiv Detail & Related papers (2024-10-28T21:04:58Z)
Exploratory Data Analysis for Banking and Finance: Unveiling Insights and Patterns [0.2594420805049218]
The study examines transaction patterns, credit limits, and usage across merchant categories. It also considers demographic factors like age, gender, and income on usage patterns. The report addresses customer churning, analyzing churn rates and factors such as demographics, transaction history, and satisfaction levels.
arXiv Detail & Related papers (2024-05-25T16:15:21Z)
FedCAda: Adaptive Client-Side Optimization for Accelerated and Stable Federated Learning [57.38427653043984]
Federated learning (FL) has emerged as a prominent approach for collaborative training of machine learning models across distributed clients. We introduce FedCAda, an innovative federated client adaptive algorithm designed to tackle this challenge. We demonstrate that FedCAda outperforms the state-of-the-art methods in terms of adaptability, convergence, stability, and overall performance.
arXiv Detail & Related papers (2024-05-20T06:12:33Z)
The Effects of Data Imbalance Under a Federated Learning Approach for Credit Risk Forecasting [0.0]
Credit risk forecasting plays a crucial role for commercial banks and other financial institutions in granting loans to customers. Traditional machine learning methods require the sharing of sensitive client information with an external server to build a global model. A newly developed privacy-preserving distributed machine learning technique known as Federated Learning (FL) allows the training of a global model without the necessity of accessing private local data directly.
arXiv Detail & Related papers (2024-01-14T09:15:10Z)
Personalized Federated Learning with Attention-based Client Selection [57.71009302168411]
We propose FedACS, a new PFL algorithm with an Attention-based Client Selection mechanism. FedACS integrates an attention mechanism to enhance collaboration among clients with similar data distributions. Experiments on CIFAR10 and FMNIST validate FedACS's superiority.
arXiv Detail & Related papers (2023-12-23T03:31:46Z)
FedToken: Tokenized Incentives for Data Contribution in Federated Learning [33.93936816356012]
We propose a contribution-based tokenized incentive scheme, namely FedToken, backed by blockchain technology. We first approximate the contribution of local models during model aggregation, then strategically schedule clients lowering the communication rounds for convergence.
arXiv Detail & Related papers (2022-09-20T14:58:08Z)
Dynamic Attention-based Communication-Efficient Federated Learning [85.18941440826309]
Federated learning (FL) offers a solution to train a global machine learning model. FL suffers performance degradation when client data distribution is non-IID. We propose a new adaptive training algorithm AdaFL to combat this degradation.
arXiv Detail & Related papers (2021-08-12T14:18:05Z)
Enhancing User's Income Estimation with Super-App Alternative Data [59.60094442546867]
It compares the performance of these alternative data sources with the performance of industry-accepted bureau income estimators. Ultimately, this paper shows the incentive for financial institutions to seek to incorporate alternative data into constructing their risk profiles.
arXiv Detail & Related papers (2021-04-12T21:34:44Z)
Supporting Financial Inclusion with Graph Machine Learning and Super-App Alternative Data [63.942632088208505]
Super-Apps have changed the way we think about the interactions between users and commerce. This paper investigates how different interactions between users within a Super-App provide a new source of information to predict borrower behavior.
arXiv Detail & Related papers (2021-02-19T15:13:06Z)
Towards Intelligent Risk-based Customer Segmentation in Banking [0.0]
We present an intelligent data-driven pipeline composed of a set of processing elements to move customers' data from one system to another. The goal is to present a novel intelligent customer segmentation process which automates the feature engineering, i.e., the process of using (banking) domain knowledge to extract features from raw data. Our proposed method is able to achieve accuracy of 91% compared to classical approaches in terms of detecting, identifying and classifying transactions into the right class.
arXiv Detail & Related papers (2020-09-29T11:22:04Z)
Super-App Behavioral Patterns in Credit Risk Models: Financial, Statistical and Regulatory Implications [110.54266632357673]
We present the impact of alternative data that originates from an app-based marketplace, in contrast to traditional bureau data, upon credit scoring models. Our results, validated across two countries, show that these new sources of data are particularly useful for predicting financial behavior in low-wealth and young individuals.
arXiv Detail & Related papers (2020-05-09T01:32:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.