Related papers: Machine Learning for Temporal Data in Finance: Challenges and Opportunities

URL: http://arxiv.org/abs/2009.05636v1
Date: Fri, 11 Sep 2020 19:39:27 GMT
Title: Machine Learning for Temporal Data in Finance: Challenges and Opportunities
Authors: Jason Wittenbach, Brian d'Alessandro, C. Bayan Bruss
Abstract summary: Temporal data are ubiquitous in the financial services (FS) industry. But machine learning efforts often fail to account for the temporal richness of these data.
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Temporal data are ubiquitous in the financial services (FS) industry -- traditional data like economic indicators, operational data such as bank account transactions, and modern data sources like website clickstreams -- all of these occur as a time-indexed sequence. But machine learning efforts in FS often fail to account for the temporal richness of these data, even in cases where domain knowledge suggests that the precise temporal patterns between events should contain valuable information. At best, such data are often treated as uniform time series, where there is a sequence but no sense of exact timing. At worst, rough aggregate features are computed over a pre-selected window so that static sample-based approaches can be applied (e.g. number of open lines of credit in the previous year or maximum credit utilization over the previous month). Such approaches are at odds with the deep learning paradigm which advocates for building models that act directly on raw or lightly processed data and for leveraging modern optimization techniques to discover optimal feature transformations en route to solving the modeling task at hand. Furthermore, a full picture of the entity being modeled (customer, company, etc.) might only be attainable by examining multiple data streams that unfold across potentially vastly different time scales. In this paper, we examine the different types of temporal data found in common FS use cases, review the current machine learning approaches in this area, and finally assess challenges and opportunities for researchers working at the intersection of machine learning for temporal data and applications in FS.

Related papers

Learning-Augmented Moment Estimation on Time-Decay Models [55.06256430461023]
We use an oracle for the heavy-hitters of datasets to give learning-augmented algorithms for a number of fundamental problems. We complement our theoretical results with a number of empirical evaluations that demonstrate the practical efficiency of our algorithms on real and synthetic datasets.
arXiv Detail & Related papers (2026-03-03T00:42:34Z)
Evaluating Transfer Learning Methods on Real-World Data Streams: A Case Study in Financial Fraud Detection [4.689506737427387]
When the available data for a target domain is limited, transfer learning (TL) methods can be used to develop models on related data-rich domains. We propose a data manipulation framework that simulates varying data availability scenarios over time. We demonstrate the usefulness of the proposed framework by performing a case study on a proprietary real-world suite of card payment datasets.
arXiv Detail & Related papers (2025-07-29T14:12:21Z)
Beyond Data Scarcity: A Frequency-Driven Framework for Zero-Shot Forecasting [15.431513584239047]
Time series forecasting is critical in numerous real-world applications. Traditional forecasting techniques struggle when data is scarce or not available at all. Recent advancements often leverage large-scale foundation models for such tasks.
arXiv Detail & Related papers (2024-11-24T07:44:39Z)
Tackling Data Heterogeneity in Federated Time Series Forecasting [61.021413959988216]
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting. Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server. We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
arXiv Detail & Related papers (2024-11-24T04:56:45Z)
UniCL: A Universal Contrastive Learning Framework for Large Time Series Models [18.005358506435847]
Time-series analysis plays a pivotal role across a range of critical applications, from finance to healthcare. Traditional supervised learning methods first annotate extensive labels for time-series data in each task. This paper introduces UniCL, a universal and scalable contrastive learning framework designed for pretraining time-series foundation models.
arXiv Detail & Related papers (2024-05-17T07:47:11Z)
Probing the Robustness of Time-series Forecasting Models with CounterfacTS [1.823020744088554]
We present and publicly release CounterfacTS, a tool to probe the robustness of deep learning models in time-series forecasting tasks. CounterfacTS has a user-friendly interface that allows the user to visualize, compare and quantify time series data and their forecasts.
arXiv Detail & Related papers (2024-03-06T07:34:47Z)
Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook [95.32949323258251]
Temporal data, notably time series and spatio-temporal data, are prevalent in real-world applications. Recent advances in large language and other foundational models have spurred increased use in time series and spatio-temporal data mining.
arXiv Detail & Related papers (2023-10-16T09:06:00Z)
TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series [61.436361263605114]
Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations. We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
arXiv Detail & Related papers (2023-05-19T10:11:21Z)
PIETS: Parallelised Irregularity Encoders for Forecasting with Heterogeneous Time-Series [5.911865723926626]
Heterogeneity and irregularity of multi-source data sets present a significant challenge to time-series analysis. In this work, we design a novel architecture, PIETS, to model heterogeneous time-series. We show that PIETS is able to effectively model heterogeneous temporal data and outperforms other state-of-the-art approaches in the prediction task.
arXiv Detail & Related papers (2021-09-30T20:01:19Z)
PSEUDo: Interactive Pattern Search in Multivariate Time Series with Locality-Sensitive Hashing and Relevance Feedback [3.347485580830609]
PSEUDo is an adaptive feature learning technique for exploring visual patterns in multi-track sequential data. Our algorithm features sub-linear training and inference time. We demonstrate superiority of PSEUDo in terms of efficiency, accuracy, and steerability.
arXiv Detail & Related papers (2021-04-30T13:00:44Z)
Time-Series Imputation with Wasserstein Interpolation for Optimal Look-Ahead-Bias and Variance Tradeoff [66.59869239999459]
In finance, imputation of missing returns may be applied prior to training a portfolio optimization model. There is an inherent trade-off between the look-ahead-bias of using the full data set for imputation and the larger variance in the imputation from using only the training data. We propose a Bayesian posterior consensus distribution which optimally controls the variance and look-ahead-bias trade-off in the imputation.
arXiv Detail & Related papers (2021-02-25T09:05:35Z)
Transformer Hawkes Process [79.16290557505211]
We propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long-term dependencies. THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin. We provide a concrete example, where THP achieves improved prediction performance for learning multiple point processes when incorporating their relational information.
arXiv Detail & Related papers (2020-02-21T13:48:13Z)
DeGAN : Data-Enriching GAN for Retrieving Representative Samples from a Trained Classifier [58.979104709647295]
We bridge the gap between the abundance of available data and lack of relevant data, for the future learning tasks of a trained network. We use the available data, that may be an imbalanced subset of the original training dataset, or a related domain dataset, to retrieve representative samples. We demonstrate that data from a related domain can be leveraged to achieve state-of-the-art performance.
arXiv Detail & Related papers (2019-12-27T02:05:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.