Tensor-Empowered Asset Pricing with Missing Data
- URL: http://arxiv.org/abs/2508.01861v2
- Date: Sat, 20 Sep 2025 01:30:51 GMT
- Title: Tensor-Empowered Asset Pricing with Missing Data
- Authors: Junyi Mo, Jiayu Li, Duo Zhang, Elynn Chen
- Abstract summary: We introduce an Adaptive, Cluster-based Temporal smoothing tensor completion framework (ACT-Tensor) for missing financial data panels. ACT-Tensor consistently outperforms state-of-the-art benchmarks in terms of imputation accuracy across a range of missing data regimes. Results show that ACT-Tensor not only achieves accurate return forecasting but also significantly improves risk-adjusted returns of the constructed portfolio.
- Score: 2.3404331106562677
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Missing data in financial panels presents a critical obstacle, undermining asset-pricing models and reducing the effectiveness of investment strategies. Such panels are often inherently multi-dimensional, spanning firms, time, and financial variables, which adds complexity to the imputation task. Conventional imputation methods often fail by flattening the data's multidimensional structure, struggling with heterogeneous missingness patterns, or overfitting in the face of extreme data sparsity. To address these limitations, we introduce an Adaptive, Cluster-based Temporal smoothing tensor completion framework (ACT-Tensor) tailored for severely and heterogeneously missing multi-dimensional financial data panels. ACT-Tensor incorporates two key innovations: a cluster-based completion module that captures cross-sectional heterogeneity by learning group-specific latent structures; and a temporal smoothing module that proactively removes short-lived noise while preserving slow-moving fundamental trends. Extensive experiments show that ACT-Tensor consistently outperforms state-of-the-art benchmarks in terms of imputation accuracy across a range of missing data regimes, including extreme sparsity scenarios. To assess its practical financial utility, we evaluate the imputed data with a latent factor model tailored for tensor-structured financial data. Results show that ACT-Tensor not only achieves accurate return forecasting but also significantly improves risk-adjusted returns of the constructed portfolio. These findings confirm that our method delivers highly accurate and informative imputations, offering substantial value for financial decision-making.
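The two modules named in the abstract (cluster-based completion and temporal smoothing) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the density-based clustering rule, the iterated truncated-SVD fill on an unfolded sub-tensor (a simple stand-in for a tensor decomposition), the moving-average smoother, the order of the steps, and every function and parameter name below are assumptions made for illustration only.

```python
import numpy as np

def lowrank_impute(M, rank=2, n_iter=50):
    """Fill NaNs in a 2-D array by iterated truncated SVD (hard-impute style)."""
    mask = ~np.isnan(M)
    col_mean = np.where(mask, M, 0.0).sum(axis=0) / np.maximum(mask.sum(axis=0), 1)
    filled = np.where(mask, M, col_mean)          # start from column means
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled = np.where(mask, M, approx)        # observed entries stay fixed
    return filled

def cluster_by_density(X, threshold=0.5):
    """Assumed rule: label a firm 1 (dense) or 0 (sparse) by its observed share."""
    observed_share = (~np.isnan(X)).mean(axis=(0, 2))
    return (observed_share >= threshold).astype(int)

def smooth_time(X, window=3):
    """Centered moving average along axis 0 (time), with edge padding."""
    pad = window // 2
    padded = np.pad(X, ((pad, pad), (0, 0), (0, 0)), mode="edge")
    return np.stack([padded[t:t + window].mean(axis=0) for t in range(X.shape[0])])

def impute_panel(X, rank=2, threshold=0.5, window=3):
    """X has shape (time, firms, characteristics); NaN marks missing entries."""
    T, _, K = X.shape
    labels = cluster_by_density(X, threshold)
    completed = np.empty_like(X)
    for g in np.unique(labels):
        idx = np.where(labels == g)[0]
        # Unfold the cluster's sub-tensor to (time, firms_in_cluster * K).
        M = X[:, idx, :].reshape(T, -1)
        completed[:, idx, :] = lowrank_impute(M, rank).reshape(T, len(idx), K)
    smoothed = smooth_time(completed, window)
    # Keep observed values; use smoothed estimates only where data were missing.
    return np.where(np.isnan(X), smoothed, X)
```

Fitting each cluster separately lets sparsely observed firms get their own low-rank structure instead of being dominated by densely observed ones, which is the intuition the abstract gives for group-specific latent structures.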
Related papers
- RFOD: Random Forest-based Outlier Detection for Tabular Data [12.469208664014472]
Outlier detection is crucial for safeguarding data integrity in high-stakes domains such as cybersecurity, financial fraud detection, and healthcare. RFOD reframes anomaly detection as a feature-wise conditional reconstruction problem. RFOD consistently outperforms state-of-the-art baselines in detection accuracy.
arXiv Detail & Related papers (2025-10-09T19:02:12Z)
- TabINR: An Implicit Neural Representation Framework for Tabular Data Imputation [0.6407815281667869]
We introduce TabINR, an auto-decoder based Implicit Neural Representation framework that models tables as neural functions. We evaluate our framework across a diverse range of twelve real-world datasets and multiple missingness mechanisms.
arXiv Detail & Related papers (2025-10-01T17:24:35Z)
- Revisiting Multivariate Time Series Forecasting with Missing Values [65.30332997607141]
Missing values are common in real-world time series. Current approaches have developed an imputation-then-prediction framework that uses imputation modules to fill in missing values, followed by forecasting on the imputed data. This framework overlooks a critical issue: there is no ground truth for the missing values, making the imputation process susceptible to errors that can degrade prediction accuracy. We introduce Consistency-Regularized Information Bottleneck (CRIB), a novel framework built on the Information Bottleneck principle.
arXiv Detail & Related papers (2025-09-27T20:57:48Z)
- Dynamic Lagging for Time-Series Forecasting in E-Commerce Finance: Mitigating Information Loss with A Hybrid ML Architecture [0.8192992814374568]
We propose a hybrid forecasting framework that integrates dynamic lagged feature engineering and adaptive rolling-window representations. Our approach explicitly incorporates invoice-level behavioral modeling, structured lag of support data, and custom stability-aware loss functions.
arXiv Detail & Related papers (2025-09-24T15:33:16Z)
- Neutralizing Token Aggregation via Information Augmentation for Efficient Test-Time Adaptation [59.1067331268383]
Test-Time Adaptation (TTA) has emerged as an effective solution for adapting Vision Transformers (ViT) to distribution shifts without additional training data. To reduce inference cost, plug-and-play token aggregation methods merge redundant tokens in ViTs to reduce total processed tokens. We formalize this problem as Efficient Test-Time Adaptation (ETTA), seeking to preserve the adaptation capability of TTA while reducing inference latency.
arXiv Detail & Related papers (2025-08-05T12:40:55Z)
- Quantifying the ROI of Cyber Threat Intelligence: A Data-Driven Approach [0.0]
This study introduces a data-driven methodology for quantifying the return on investment of Cyber Threat Intelligence. The proposed framework extends established models in security economics to account for CTI's complex influence on both the probability of security breaches and the severity of associated losses.
arXiv Detail & Related papers (2025-07-23T15:54:56Z)
- Stress-Testing ML Pipelines with Adversarial Data Corruption [11.91482648083998]
Regulators now demand evidence that high-stakes systems can withstand realistic, interdependent errors. We introduce SAVAGE, a framework that formally models data-quality issues through dependency graphs and flexible corruption templates. SAVAGE employs a bi-level optimization approach to efficiently identify vulnerable data subpopulations and fine-tune corruption severity.
arXiv Detail & Related papers (2025-06-02T00:41:24Z)
- AdvKT: An Adversarial Multi-Step Training Framework for Knowledge Tracing [64.79967583649407]
Knowledge Tracing (KT) monitors students' knowledge states and simulates their responses to question sequences. Existing KT models typically follow a single-step training paradigm, which leads to significant error accumulation. We propose a novel Adversarial Multi-Step Training Framework for Knowledge Tracing (AdvKT) which focuses on the multi-step KT task.
arXiv Detail & Related papers (2025-04-07T03:31:57Z)
- FinTSB: A Comprehensive and Practical Benchmark for Financial Time Series Forecasting [58.70072722290475]
Financial time series (FinTS) record the behavior of human-brain-augmented decision-making. FinTSB is a comprehensive and practical benchmark for financial time series forecasting.
arXiv Detail & Related papers (2025-02-26T05:19:16Z)
- DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs [70.91804882618243]
This paper proposes DSMoE, a novel approach that achieves sparsification by partitioning pre-trained FFN layers into computational blocks. We implement adaptive expert routing using sigmoid activation and straight-through estimators, enabling tokens to flexibly access different aspects of model knowledge. Experiments on LLaMA models demonstrate that under equivalent computational constraints, DSMoE achieves superior performance compared to existing pruning and MoE approaches.
arXiv Detail & Related papers (2025-02-18T02:37:26Z)
- Contrastive Learning of Asset Embeddings from Financial Time Series [8.595725772518332]
We propose a novel contrastive learning framework to generate asset embeddings from financial time series data.
Our approach leverages the similarity of asset returns over many subwindows to generate informative positive and negative samples.
Experiments on real-world datasets demonstrate the effectiveness of the learned asset embeddings on benchmark industry classification and portfolio optimization tasks.
arXiv Detail & Related papers (2024-07-26T10:26:44Z)
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Models (LLMs) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z)
- Graph-Regularized Tensor Regression: A Domain-Aware Framework for Interpretable Multi-Way Financial Modelling [23.030263841031633]
We develop a novel Graph-Regularized Tensor Regression (GRTR) framework, whereby knowledge about cross-asset relations is incorporated into the model in the form of a graph Laplacian matrix.
By virtue of tensor algebra, the proposed framework is shown to be fully interpretable, both coefficient-wise and dimension-wise.
The GRTR model is validated in a multi-way financial forecasting setting and is shown to achieve improved performance at reduced computational costs.
arXiv Detail & Related papers (2022-10-26T13:39:08Z)
- Truncated tensor Schatten p-norm based approach for spatiotemporal traffic data imputation with complicated missing patterns [77.34726150561087]
We introduce four complicated missing patterns, including random missing and three fiber-like missing cases according to the mode-driven fibers.
Despite nonconvexity of the objective function in our model, we derive the optimal solutions by integrating the alternating direction method of multipliers.
arXiv Detail & Related papers (2022-05-19T08:37:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.