Incorporating data drift to perform survival analysis on credit risk
- URL: http://arxiv.org/abs/2601.20533v1
- Date: Wed, 28 Jan 2026 12:22:08 GMT
- Title: Incorporating data drift to perform survival analysis on credit risk
- Authors: Jianwei Peng, Stefan Lessmann
- Abstract summary: This study investigates the impact of data drift on survival-based credit risk models. It proposes a dynamic joint modelling framework to improve robustness under non-stationary environments.
- Score: 5.250238356744497
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Survival analysis has become a standard approach for modelling time to default with time-varying covariates in credit risk. Most existing methods implicitly assume a stationary data-generating process, but in practice mortgage portfolios are exposed to various forms of data drift caused by changing borrower behaviour, macroeconomic conditions, policy regimes and so on. This study investigates the impact of data drift on survival-based credit risk models and proposes a dynamic joint modelling framework to improve robustness under non-stationary environments. The proposed model integrates a longitudinal behavioural marker derived from balance dynamics with a discrete-time hazard formulation, combined with landmark one-hot encoding and isotonic calibration. Three types of data drift (sudden, incremental and recurring) are simulated and analysed on mortgage loan datasets from Freddie Mac. Experiments show that the proposed landmark-based joint model consistently outperforms classical survival models, tree-based drift-adaptive learners and gradient boosting methods in terms of discrimination and calibration across all drift scenarios, which confirms the superiority of our model design.
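The abstract names three drift types and a discrete-time hazard formulation without giving details. The following is a minimal sketch of how those ideas might fit together, not the authors' simulation: the function names, the logistic link, and every parameter value (`delta`, `change_point`, `ramp`, `period`, `intercept`, `beta`) are hypothetical and chosen only for illustration.

```python
import math

def drift_shift(t, kind, delta=1.0, change_point=12, ramp=12, period=6):
    """Shift added to a covariate at month t (hypothetical parameterisation)."""
    if kind == "sudden":
        # Abrupt regime change at the change point.
        return delta if t >= change_point else 0.0
    if kind == "incremental":
        # Linear ramp from 0 to delta over `ramp` months after the change point.
        frac = (t - change_point) / ramp
        return delta * min(max(frac, 0.0), 1.0)
    if kind == "recurring":
        # Alternate between the old and the new regime every `period` months.
        return delta if (t // period) % 2 == 1 else 0.0
    raise ValueError(f"unknown drift kind: {kind}")

def hazard(x, intercept=-4.0, beta=0.8):
    """Discrete-time hazard with a logistic link (an assumed form):
    P(default in month t | no default before t)."""
    return 1.0 / (1.0 + math.exp(-(intercept + beta * x)))

def survival_curve(hazards):
    """S(t) = product over k <= t of (1 - h_k)."""
    curve, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        curve.append(s)
    return curve

months = range(36)
baseline = survival_curve([hazard(0.0) for _ in months])
curves = {
    kind: survival_curve([hazard(drift_shift(t, kind)) for t in months])
    for kind in ("sudden", "incremental", "recurring")
}

for kind, curve in curves.items():
    print(f"{kind:12s} S(36) = {curve[-1]:.4f}  (no drift: {baseline[-1]:.4f})")
```

Under these assumptions, a positive shift raises the monthly hazard, so each drifted survival curve ends below the no-drift baseline. A model fitted only on pre-drift months would therefore overestimate survival.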
Related papers
- Temporal-Aligned Meta-Learning for Risk Management: A Stacking Approach for Multi-Source Credit Scoring [0.0]
This paper presents a meta-learning framework for credit risk assessment of Italian Small and Medium Enterprises (SMEs). The approach aligns financial statement reference dates with evaluation dates, mitigating bias arising from publication delays and asynchronous data sources. Empirical validation shows that the framework effectively captures credit risk evolution over time, improving temporal consistency and predictive stability relative to standard ensemble methods.
arXiv Detail & Related papers (2026-01-12T14:36:54Z)
- TraCeR: Transformer-Based Competing Risk Analysis with Longitudinal Covariates [0.0]
TraCeR is a transformer-based survival analysis framework. It estimates the hazard function from a sequence of measurements. Experiments on multiple real-world datasets demonstrate substantial and statistically significant performance improvements.
arXiv Detail & Related papers (2025-12-19T23:24:47Z)
- Counterfactual Probabilistic Diffusion with Expert Models [44.96279296893773]
We propose a time series diffusion-based framework that incorporates guidance from imperfect expert models. Our method, ODE-Diff, bridges mechanistic and data-driven approaches, enabling more reliable and interpretable causal inference.
arXiv Detail & Related papers (2025-08-18T20:44:32Z)
- Bayesian Models for Joint Selection of Features and Auto-Regressive Lags: Theory and Applications in Environmental and Financial Forecasting [0.9208007322096533]
We develop a Bayesian framework for variable selection in linear regression with autocorrelated errors. Compared to existing methods, our framework achieves lower MSPE, improved true model component identification, and greater consistency with autocorrelated noise.
arXiv Detail & Related papers (2025-08-12T18:44:36Z)
- Frugal, Flexible, Faithful: Causal Data Simulation via Frengression [4.446798246007668]
We introduce frengression, a deep generative realization of the frugal parameterization. Frengression provides accurate estimation and flexible, faithful simulation of time-varying data. We envision this framework sparking new research into generative approaches for causal margin modelling.
arXiv Detail & Related papers (2025-08-01T18:43:59Z)
- On conditional diffusion models for PDE simulations [53.01911265639582]
We study score-based diffusion models for forecasting and assimilation of sparse observations.
We propose an autoregressive sampling approach that significantly improves performance in forecasting.
We also propose a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths.
arXiv Detail & Related papers (2024-10-21T18:31:04Z)
- A Spatio-Temporal Machine Learning Model for Mortgage Credit Risk: Default Probabilities and Loan Portfolios [11.141688859736805]
We introduce a machine learning model for credit risk by combining tree-boosting with a latent-temporal and Gaussian process model accounting for frailty correlation. We find that both predictive default probabilities for individual loans and predictive loan portfolio loss distributions are more accurate compared to conventional independent linear hazard models.
arXiv Detail & Related papers (2024-10-03T15:10:55Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model.
We introduce three robustness indicators and conduct experiments across diverse robust datasets.
Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z)
- Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
- Generative Temporal Difference Learning for Infinite-Horizon Prediction [101.59882753763888]
We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon.
We discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors.
arXiv Detail & Related papers (2020-10-27T17:54:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.