ProteuS: A Generative Approach for Simulating Concept Drift in Financial Markets
- URL: http://arxiv.org/abs/2509.11844v1
- Date: Sat, 30 Aug 2025 21:01:47 GMT
- Title: ProteuS: A Generative Approach for Simulating Concept Drift in Financial Markets
- Authors: Andrés L. Suárez-Cetrulo, Alejandro Cervantes, David Quintana
- Abstract summary: A fundamental problem in developing and validating adaptive algorithms is the lack of a ground truth in real-world financial data. This paper introduces a novel framework, named ProteuS, for generating semi-synthetic financial time series with pre-defined structural breaks. An analysis of the generated data confirms the complexity of the task, revealing significant overlap between the different market states.
- Score: 44.76567557906836
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Financial markets are complex, non-stationary systems where the underlying data distributions can shift over time, a phenomenon known as regime change in finance and as concept drift in the machine learning literature. These shifts, often triggered by major economic events, pose a significant challenge for traditional statistical and machine learning models. A fundamental problem in developing and validating adaptive algorithms is the lack of a ground truth in real-world financial data, making it difficult to evaluate a model's ability to detect and recover from these drifts. This paper addresses this challenge by introducing a novel framework, named ProteuS, for generating semi-synthetic financial time series with pre-defined structural breaks. Our methodology involves fitting ARMA-GARCH models to real-world ETF data to capture distinct market regimes, and then simulating realistic gradual and abrupt transitions between them. The resulting datasets, which include a comprehensive set of technical indicators, provide a controlled environment with a known ground truth of regime changes. An analysis of the generated data confirms the complexity of the task, revealing significant overlap between the different market states. We aim to provide the research community with a tool for the rigorous evaluation of concept drift detection and adaptation mechanisms, paving the way for more robust financial forecasting models.
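The abstract's core idea (regime-specific ARMA-GARCH processes joined by abrupt or gradual transitions with known labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the ARMA(1,1)-GARCH(1,1) orders, the parameter values in `CALM` and `STRESSED`, the function names, and the sigmoid-probability mixing used for gradual transitions are all assumptions made for illustration, and nothing here is fitted to ETF data.

```python
import numpy as np

def simulate_arma_garch(n, mu, phi, theta, omega, alpha, beta, rng):
    """Simulate n returns with an ARMA(1,1) mean and GARCH(1,1) variance."""
    r = np.zeros(n)
    eps_prev = 0.0
    r_prev = mu
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(n):
        var = omega + alpha * eps_prev**2 + beta * var  # GARCH(1,1) recursion
        eps = np.sqrt(var) * rng.standard_normal()       # Gaussian innovation
        r[t] = mu + phi * (r_prev - mu) + theta * eps_prev + eps
        eps_prev, r_prev = eps, r[t]
    return r

def drift_weights(n, center, width):
    """Probability of drawing from the new regime at each time step.

    width == 0 gives an abrupt switch at `center`; a larger width gives a
    gradual, sigmoid-shaped transition centred on `center`.
    """
    t = np.arange(n)
    if width == 0:
        return (t >= center).astype(float)
    return 1.0 / (1.0 + np.exp(-4.0 * (t - center) / width))

def simulate_drift(n, regime_a, regime_b, center, width, seed=0):
    """Return (returns, labels); labels are the ground truth (0 = A, 1 = B)."""
    rng = np.random.default_rng(seed)
    a = simulate_arma_garch(n, rng=rng, **regime_a)
    b = simulate_arma_garch(n, rng=rng, **regime_b)
    w = drift_weights(n, center, width)
    labels = (rng.random(n) < w).astype(int)
    returns = np.where(labels == 1, b, a)
    return returns, labels

# Hypothetical regime parameters (alpha + beta < 1 keeps the variance stationary).
CALM = dict(mu=0.0005, phi=0.05, theta=0.02, omega=1e-6, alpha=0.05, beta=0.90)
STRESSED = dict(mu=-0.0005, phi=0.10, theta=0.05, omega=5e-6, alpha=0.12, beta=0.85)

abrupt_r, abrupt_y = simulate_drift(1000, CALM, STRESSED, center=500, width=0)
gradual_r, gradual_y = simulate_drift(1000, CALM, STRESSED, center=500, width=200)
```

Because the labels are generated alongside the returns, a drift detector's output can be scored directly against `abrupt_y` or `gradual_y`, which is the kind of known ground truth the abstract describes.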
Related papers
- ASTIF: Adaptive Semantic-Temporal Integration for Cryptocurrency Price Forecasting [6.12055122337183]
ASTIF is a hybrid intelligent system that adapts its forecasting strategy in real time through confidence-based meta-learning. A confidence-aware meta-learner functions as an adaptive inference layer, modulating each predictor's contribution based on its real-time uncertainty. The research contributes a scalable, knowledge-based solution for fusing quantitative and qualitative data in non-stationary environments.
arXiv Detail & Related papers (2025-12-21T09:17:36Z)
- Revisiting Multivariate Time Series Forecasting with Missing Values [65.30332997607141]
Missing values are common in real-world time series. Current approaches have developed an imputation-then-prediction framework that uses imputation modules to fill in missing values, followed by forecasting on the imputed data. This framework overlooks a critical issue: there is no ground truth for the missing values, making the imputation process susceptible to errors that can degrade prediction accuracy. We introduce Consistency-Regularized Information Bottleneck (CRIB), a novel framework built on the Information Bottleneck principle.
arXiv Detail & Related papers (2025-09-27T20:57:48Z)
- From Physics to Machine Learning and Back: Part II - Learning and Observational Bias in PHM [52.64097278841485]
This review examines how incorporating learning and observational biases through physics-informed modeling and data strategies can guide models toward physically consistent and reliable predictions. Fast adaptation methods including meta-learning and few-shot learning are reviewed alongside domain generalization techniques.
arXiv Detail & Related papers (2025-09-25T14:15:43Z)
- Anomaly Detection and Generation with Diffusion Models: A Survey [51.61574868316922]
Anomaly detection (AD) plays a pivotal role across diverse domains, including cybersecurity, finance, healthcare, and industrial manufacturing. Recent advancements in deep learning, specifically diffusion models (DMs), have sparked significant interest. This survey aims to guide researchers and practitioners in leveraging DMs for innovative AD solutions across diverse applications.
arXiv Detail & Related papers (2025-06-11T03:29:18Z)
- Cross-Modal Temporal Fusion for Financial Market Forecasting [3.0756278306759635]
We introduce a transformer-based deep learning framework, Cross-Modal Temporal Fusion (CMTF), that fuses structured and unstructured financial data for improved market prediction. Experimental results using FTSE 100 stock data demonstrate that CMTF achieves superior performance in price direction classification compared to classical and deep learning baselines.
arXiv Detail & Related papers (2025-04-18T07:20:18Z)
- Financial Wind Tunnel: A Retrieval-Augmented Market Simulator [8.687612511755836]
Market simulators try to create high-quality synthetic financial data that mimics real-world market dynamics. Financial Wind Tunnel (FWT) is a retrieval-augmented market simulator designed to generate controllable, reasonable, and adaptable market dynamics. FWT offers a more comprehensive and systematic generative capability across different data frequencies.
arXiv Detail & Related papers (2025-03-23T03:10:13Z)
- Empowering Time Series Analysis with Synthetic Data: A Survey and Outlook in the Era of Foundation Models [104.17057231661371]
Time series analysis is crucial for understanding the dynamics of complex systems. Recent advances in foundation models have led to task-agnostic Time Series Foundation Models (TSFMs) and Large Language Model-based Time Series Models (TSLLMs). Their success depends on large, diverse, and high-quality datasets, which are challenging to build due to regulatory, diversity, quality, and quantity constraints. This survey provides a comprehensive review of synthetic data for TSFMs and TSLLMs, analyzing data generation strategies, their role in model pretraining, fine-tuning, and evaluation, and identifying future research directions.
arXiv Detail & Related papers (2025-03-14T13:53:46Z)
- FinTSB: A Comprehensive and Practical Benchmark for Financial Time Series Forecasting [58.70072722290475]
Financial time series (FinTS) record the behavior of human-brain-augmented decision-making. FinTSB is a comprehensive and practical benchmark for financial time series forecasting.
arXiv Detail & Related papers (2025-02-26T05:19:16Z)
- A Deep Learning Framework Integrating CNN and BiLSTM for Financial Systemic Risk Analysis and Prediction [17.6825558707504]
This study proposes a deep learning model that combines a convolutional neural network (CNN) with a bidirectional long short-term memory network (BiLSTM). The model first uses the CNN to extract local patterns from multidimensional features of financial markets, and then models the bidirectional dependency of the time series through the BiLSTM. The results show that the model is significantly superior to traditional single models in terms of accuracy, recall, and F1 score.
arXiv Detail & Related papers (2025-02-07T07:57:11Z)
- Advanced Risk Prediction and Stability Assessment of Banks Using Time Series Transformer Models [10.79035001851989]
This paper proposes a prediction framework based on the Time Series Transformer model. We compare the model with LSTM, GRU, CNN, TCN and RNN-Transformer models. The experimental results show that the Time Series Transformer model outperforms other models in both mean square error (MSE) and mean absolute error (MAE) evaluation indicators.
arXiv Detail & Related papers (2024-12-04T08:15:27Z)
- Market-GAN: Adding Control to Financial Market Data Generation with Semantic Context [23.773217528211905]
Current financial datasets do not contain context labels.
Current techniques are not designed to generate financial data with context as control.
Market-GAN is a novel architecture incorporating a Generative Adversarial Network (GAN) for controllable generation with context.
arXiv Detail & Related papers (2023-09-14T13:42:27Z)
- Bayesian Bilinear Neural Network for Predicting the Mid-price Dynamics in Limit-Order Book Markets [84.90242084523565]
Traditional time-series econometric methods often appear incapable of capturing the true complexity of the multi-level interactions driving the price dynamics.
By adopting a state-of-the-art second-order optimization algorithm, we train a Bayesian bilinear neural network with temporal attention.
By addressing the use of predictive distributions to analyze errors and uncertainties associated with the estimated parameters and model forecasts, we thoroughly compare our Bayesian model with traditional ML alternatives.
arXiv Detail & Related papers (2022-03-07T18:59:54Z)
- Gaussian process imputation of multiple financial series [71.08576457371433]
Multiple time series such as financial indicators, stock prices and exchange rates are strongly coupled due to their dependence on the latent state of the market.
We focus on learning the relationships among financial time series by modelling them through a multi-output Gaussian process.
arXiv Detail & Related papers (2020-02-11T19:18:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.