From Patterns to Predictions: A Shapelet-Based Framework for Directional Forecasting in Noisy Financial Markets
- URL: http://arxiv.org/abs/2509.15040v1
- Date: Thu, 18 Sep 2025 15:05:27 GMT
- Title: From Patterns to Predictions: A Shapelet-Based Framework for Directional Forecasting in Noisy Financial Markets
- Authors: Juwon Kim, Hyunwook Lee, Hyotaek Jeon, Seungmin Jin, Sungahn Ko
- Abstract summary: Directional forecasting in financial markets requires both accuracy and interpretability. We propose a two-stage framework that integrates unsupervised pattern extraction with interpretable forecasting. Our approach enables transparent decision-making by revealing the underlying pattern structures that drive predictive outcomes.
- Score: 8.168261768703621
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Directional forecasting in financial markets requires both accuracy and interpretability. Before the advent of deep learning, interpretable approaches based on human-defined patterns were prevalent, but their structural vagueness and scale ambiguity hindered generalization. In contrast, deep learning models can effectively capture complex dynamics, yet often offer limited transparency. To bridge this gap, we propose a two-stage framework that integrates unsupervised pattern extraction with interpretable forecasting. (i) SIMPC segments and clusters multivariate time series, extracting recurrent patterns that are invariant to amplitude scaling and temporal distortion, even under varying window sizes. (ii) JISC-Net is a shapelet-based classifier that uses the initial part of extracted patterns as input and forecasts subsequent partial sequences for short-term directional movement. Experiments on Bitcoin and three S&P 500 equities demonstrate that our method ranks first or second in 11 out of 12 metric--dataset combinations, consistently outperforming baselines. Unlike conventional deep learning models that output buy-or-sell signals without interpretable justification, our approach enables transparent decision-making by revealing the underlying pattern structures that drive predictive outcomes.
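The abstract does not publish SIMPC's or JISC-Net's internals, but the core idea of a shapelet-based directional classifier can be illustrated generically: match the initial part of a new series against extracted patterns via minimum sliding-window distance, then vote with the label of the nearest pattern. The patterns, labels, and series below are hypothetical toy values, not the paper's data; this is a minimal sketch of standard shapelet matching, not the authors' method.

```python
import numpy as np

def min_shapelet_dist(series, shapelet):
    """Minimum z-normalized Euclidean distance between a shapelet and any
    equal-length window of the series (the standard shapelet match score)."""
    m = len(shapelet)
    s = (shapelet - shapelet.mean()) / (shapelet.std() + 1e-8)
    best = np.inf
    for i in range(len(series) - m + 1):
        w = series[i:i + m]
        w = (w - w.mean()) / (w.std() + 1e-8)
        best = min(best, float(np.linalg.norm(w - s)))
    return best

def predict_direction(prefix, shapelets, labels):
    """Nearest-shapelet vote: assign the directional label of the
    extracted pattern whose prefix best matches the observed prefix."""
    dists = [min_shapelet_dist(prefix, s) for s in shapelets]
    return labels[int(np.argmin(dists))]

# Toy example with two hypothetical extracted patterns
up_pat = np.array([0.0, 0.2, 0.5, 0.9])    # rising shapelet
down_pat = np.array([0.9, 0.5, 0.2, 0.0])  # falling shapelet
prefix = np.array([1.0, 1.1, 1.4, 1.9, 2.5])
print(predict_direction(prefix, [up_pat, down_pat], ["up", "down"]))  # up
```

Because both the windows and the shapelets are z-normalized, the match is invariant to amplitude scaling, one of the invariances the abstract attributes to SIMPC's extracted patterns.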
Related papers
- How to model Human Actions distribution with Event Sequence Data [22.25731364559209]
We study the forecasting of the future distribution of events in human action sequences. We find that a simple explicit distribution forecasting objective consistently surpasses complex implicit baselines. This work provides a principled framework for selecting modeling strategies and offers practical guidance for building more accurate and robust forecasting systems.
arXiv Detail & Related papers (2025-10-07T12:24:54Z) - ProtoTS: Learning Hierarchical Prototypes for Explainable Time Series Forecasting [25.219624871510376]
We propose ProtoTS, a novel interpretable forecasting framework that achieves both high accuracy and transparent decision-making. ProtoTS computes instance-prototype similarity based on a denoised representation that preserves abundant heterogeneous information. Experiments on multiple realistic benchmarks, including a newly released LOF dataset, show that ProtoTS not only exceeds existing methods in forecast accuracy but also delivers expert-steerable interpretations.
arXiv Detail & Related papers (2025-09-27T07:10:21Z) - Bridging the Last Mile of Prediction: Enhancing Time Series Forecasting with Conditional Guided Flow Matching [9.465542901469815]
Conditional Guided Flow Matching (CGFM) is a model-agnostic framework that extends flow matching by integrating outputs from an auxiliary predictive model. CGFM incorporates historical data as both conditions and guidance, uses two-sided conditional paths, and employs affine paths to expand the path space. Experiments across datasets and baselines show CGFM consistently outperforms state-of-the-art models, advancing forecasting.
arXiv Detail & Related papers (2025-07-09T18:03:31Z) - How Far Are We from Generating Missing Modalities with Foundation Models? [49.425856207329524]
We propose an agentic framework tailored for missing modality reconstruction. Our method reduces FID for missing image reconstruction by at least 14% and MER for missing text reconstruction by at least 10% compared to baselines.
arXiv Detail & Related papers (2025-06-04T03:22:44Z) - Signal in the Noise: Polysemantic Interference Transfers and Predicts Cross-Model Influence [46.548276232795466]
Polysemanticity is pervasive in language models and remains a major challenge for interpretation and model behavioral control. We map the polysemantic topology of two small models to identify feature pairs that are semantically unrelated yet exhibit interference within models. We intervene at four loci (prompt, token, feature, neuron) and measure induced shifts in the next-token prediction distribution, uncovering polysemantic structures that expose a systematic vulnerability in these models.
arXiv Detail & Related papers (2025-05-16T18:20:42Z) - Context parroting: A simple but tough-to-beat baseline for foundation models in scientific machine learning [6.881798277354561]
We show that foundation models often forecast through a simple parroting strategy. A naive context parroting model that copies directly from the context scores higher than leading time-series foundation models.
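The parroting baseline described above is simple enough to sketch directly: find the earlier stretch of the context most similar to its recent tail, then copy what followed that stretch as the forecast. The exact matching rule in the paper may differ; this is one plausible, minimal variant.

```python
import numpy as np

def context_parrot(context, horizon):
    """Naive context parroting: locate the past window most similar to the
    recent tail of the context and copy the values that followed it.
    Falls back to repeating the last value if the context is too short."""
    context = np.asarray(context, dtype=float)
    tail_len = min(horizon, len(context) // 2)
    n_starts = len(context) - 2 * tail_len - horizon + 1
    if tail_len == 0 or n_starts <= 0:
        return np.full(horizon, context[-1])
    tail = context[-tail_len:]
    best_i, best_d = 0, np.inf
    # scan earlier windows, leaving room to copy `horizon` steps ahead
    for i in range(n_starts):
        d = float(np.linalg.norm(context[i:i + tail_len] - tail))
        if d < best_d:
            best_i, best_d = i, d
    return context[best_i + tail_len: best_i + tail_len + horizon]

# On a periodic signal, parroting recovers the next cycle exactly
t = np.arange(40)
sig = np.sin(2 * np.pi * t / 8)
pred = context_parrot(sig[:32], horizon=8)
```

On periodic or slowly varying data, which dominates many benchmarks, this copy strategy is hard to beat, which is precisely the paper's cautionary point about evaluating foundation models.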
arXiv Detail & Related papers (2025-05-16T15:14:47Z) - LLM4FTS: Enhancing Large Language Models for Financial Time Series Prediction [0.0]
Traditional machine learning models exhibit limitations in this forecasting task, constrained by their restricted model capacity. We propose LLM4FTS, a novel framework that enhances temporal sequence modeling through learnable patch segmentation and dynamic wavelet convolution modules. Experiments on real-world financial datasets substantiate the framework's efficacy, demonstrating superior performance in capturing complex market patterns and achieving state-of-the-art results in stock return prediction.
arXiv Detail & Related papers (2025-05-05T06:48:34Z) - Topology-Aware Conformal Prediction for Stream Networks [68.02503121089633]
We propose Spatio-Temporal Adaptive Conformal Inference (CISTA), a novel framework that integrates network topology and temporal dynamics into the conformal prediction framework. Our results show that CISTA effectively balances prediction efficiency and coverage, outperforming existing conformal prediction methods for stream networks.
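For readers unfamiliar with the base machinery CISTA extends, plain split conformal prediction can be sketched in a few lines: calibrate a residual quantile on held-out data, then wrap each point forecast in an interval with approximately (1 − alpha) marginal coverage. This is the generic textbook method, not CISTA's topology-aware extension.

```python
import numpy as np

def split_conformal_interval(cal_y, cal_pred, test_pred, alpha=0.1):
    """Split conformal prediction: the absolute calibration residuals
    define a quantile q; each test interval is [pred - q, pred + q]."""
    residuals = np.abs(np.asarray(cal_y) - np.asarray(cal_pred))
    n = len(residuals)
    # finite-sample corrected quantile level, capped at 1
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(residuals, q_level)
    test_pred = np.asarray(test_pred, dtype=float)
    return test_pred - q, test_pred + q

rng = np.random.default_rng(0)
y_cal = rng.normal(size=500)
pred_cal = y_cal + rng.normal(scale=0.2, size=500)  # a reasonably good predictor
lo, hi = split_conformal_interval(y_cal, pred_cal, test_pred=[0.0], alpha=0.1)
```

The coverage guarantee is only marginal and assumes exchangeability, which breaks under the temporal drift and network correlation that stream data exhibits; that gap is what adaptive, topology-aware variants like CISTA target.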
arXiv Detail & Related papers (2025-03-06T21:21:15Z) - DisenTS: Disentangled Channel Evolving Pattern Modeling for Multivariate Time Series Forecasting [43.071713191702486]
DisenTS is a tailored framework for modeling disentangled channel evolving patterns in general time series forecasting.
We introduce a novel Forecaster Aware Gate (FAG) module that generates the routing signals adaptively according to both the forecasters' states and input series' characteristics.
arXiv Detail & Related papers (2024-10-30T12:46:14Z) - Regularized Neural Ensemblers [55.15643209328513]
In this study, we explore employing regularized neural networks as ensemble methods. Motivated by the risk of learning low-diversity ensembles, we propose regularizing the ensembling model by randomly dropping base model predictions. We demonstrate this approach provides lower bounds for the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
arXiv Detail & Related papers (2024-10-06T15:25:39Z) - Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion [61.03681839276652]
Diffusion Forcing is a new training paradigm where a diffusion model is trained to denoise a set of tokens with independent per-token noise levels. We apply Diffusion Forcing to sequence generative modeling by training a causal next-token prediction model to generate one or several future tokens.
arXiv Detail & Related papers (2024-07-01T15:43:25Z) - Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective [72.55093886515824]
We introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables.
We devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a causal graph.
Experiment results on synthetic and real datasets show that our three proposed components significantly improve the robustness and reusability of the learned motion representations.
arXiv Detail & Related papers (2021-11-29T18:59:09Z) - Explaining a Series of Models by Propagating Local Feature Attributions [9.66840768820136]
Pipelines involving several machine learning models improve performance in many domains but are difficult to understand.
We introduce a framework to propagate local feature attributions through complex pipelines of models based on a connection to the Shapley value.
Our framework enables us to draw higher-level conclusions based on groups of gene expression features for Alzheimer's and breast cancer histologic grade prediction.
arXiv Detail & Related papers (2021-04-30T22:20:58Z) - Generative Temporal Difference Learning for Infinite-Horizon Prediction [101.59882753763888]
We introduce the γ-model, a predictive model of environment dynamics with an infinite probabilistic horizon.
We discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors.
arXiv Detail & Related papers (2020-10-27T17:54:12Z) - Understanding Neural Abstractive Summarization Models via Uncertainty [54.37665950633147]
seq2seq abstractive summarization models generate text in a free-form manner.
We study the entropy, or uncertainty, of the model's token-level predictions.
We show that uncertainty is a useful perspective for analyzing summarization and text generation models more broadly.
arXiv Detail & Related papers (2020-10-15T16:57:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.