Machine Learning for Temporal Data in Finance: Challenges and
Opportunities
- URL: http://arxiv.org/abs/2009.05636v1
- Date: Fri, 11 Sep 2020 19:39:27 GMT
- Title: Machine Learning for Temporal Data in Finance: Challenges and
Opportunities
- Authors: Jason Wittenbach, Brian d'Alessandro, C. Bayan Bruss
- Abstract summary: Temporal data are ubiquitous in the financial services (FS) industry.
But machine learning efforts often fail to account for the temporal richness of these data.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Temporal data are ubiquitous in the financial services (FS) industry --
traditional data like economic indicators, operational data such as bank
account transactions, and modern data sources like website clickstreams -- all
of these occur as a time-indexed sequence. But machine learning efforts in FS
often fail to account for the temporal richness of these data, even in cases
where domain knowledge suggests that the precise temporal patterns between
events should contain valuable information. At best, such data are often
treated as uniform time series, where there is a sequence but no sense of exact
timing. At worst, rough aggregate features are computed over a pre-selected
window so that static sample-based approaches can be applied (e.g. number of
open lines of credit in the previous year or maximum credit utilization over
the previous month). Such approaches are at odds with the deep learning
paradigm which advocates for building models that act directly on raw or
lightly processed data and for leveraging modern optimization techniques to
discover optimal feature transformations en route to solving the modeling task
at hand. Furthermore, a full picture of the entity being modeled (customer,
company, etc.) might only be attainable by examining multiple data streams that
unfold across potentially vastly different time scales. In this paper, we
examine the different types of temporal data found in common FS use cases,
review the current machine learning approaches in this area, and finally assess
challenges and opportunities for researchers working at the intersection of
machine learning for temporal data and applications in FS.
Related papers
- UniCL: A Universal Contrastive Learning Framework for Large Time Series Models [18.005358506435847]
Time-series analysis plays a pivotal role across a range of critical applications, from finance to healthcare.
Traditional supervised learning methods first annotate extensive labels for time-series data in each task.
This paper introduces UniCL, a universal and scalable contrastive learning framework designed for pretraining time-series foundation models.
arXiv Detail & Related papers (2024-05-17T07:47:11Z) - Probing the Robustness of Time-series Forecasting Models with
CounterfacTS [1.823020744088554]
We present and publicly release CounterfacTS, a tool to probe the robustness of deep learning models in time-series forecasting tasks.
CounterfacTS has a user-friendly interface that allows the user to visualize, compare and quantify time series data and their forecasts.
arXiv Detail & Related papers (2024-03-06T07:34:47Z) - Large Models for Time Series and Spatio-Temporal Data: A Survey and
Outlook [95.32949323258251]
Temporal data, notably time series andtemporal-temporal data, are prevalent in real-world applications.
Recent advances in large language and other foundational models have spurred increased use in time series andtemporal data mining.
arXiv Detail & Related papers (2023-10-16T09:06:00Z) - TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series [61.436361263605114]
Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations.
We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
arXiv Detail & Related papers (2023-05-19T10:11:21Z) - Time-Varying Propensity Score to Bridge the Gap between the Past and Present [104.46387765330142]
We introduce a time-varying propensity score that can detect gradual shifts in the distribution of data.
We demonstrate different ways of implementing it and evaluate it on a variety of problems.
arXiv Detail & Related papers (2022-10-04T07:21:49Z) - Using Time-Series Privileged Information for Provably Efficient Learning
of Prediction Models [6.7015527471908625]
We study prediction of future outcomes with supervised models that use privileged information during learning.
privileged information comprises samples of time series observed between the baseline time of prediction and the future outcome.
We show that our approach is generally preferable to classical learning, particularly when data is scarce.
arXiv Detail & Related papers (2021-10-28T10:07:29Z) - PIETS: Parallelised Irregularity Encoders for Forecasting with
Heterogeneous Time-Series [5.911865723926626]
Heterogeneity and irregularity of multi-source data sets present a significant challenge to time-series analysis.
In this work, we design a novel architecture, PIETS, to model heterogeneous time-series.
We show that PIETS is able to effectively model heterogeneous temporal data and outperforms other state-of-the-art approaches in the prediction task.
arXiv Detail & Related papers (2021-09-30T20:01:19Z) - PSEUDo: Interactive Pattern Search in Multivariate Time Series with
Locality-Sensitive Hashing and Relevance Feedback [3.347485580830609]
PSEUDo is an adaptive feature learning technique for exploring visual patterns in multi-track sequential data.
Our algorithm features sub-linear training and inference time.
We demonstrate superiority of PSEUDo in terms of efficiency, accuracy, and steerability.
arXiv Detail & Related papers (2021-04-30T13:00:44Z) - Time-Series Imputation with Wasserstein Interpolation for Optimal
Look-Ahead-Bias and Variance Tradeoff [66.59869239999459]
In finance, imputation of missing returns may be applied prior to training a portfolio optimization model.
There is an inherent trade-off between the look-ahead-bias of using the full data set for imputation and the larger variance in the imputation from using only the training data.
We propose a Bayesian posterior consensus distribution which optimally controls the variance and look-ahead-bias trade-off in the imputation.
arXiv Detail & Related papers (2021-02-25T09:05:35Z) - Transformer Hawkes Process [79.16290557505211]
We propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long-term dependencies.
THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin.
We provide a concrete example, where THP achieves improved prediction performance for learning multiple point processes when incorporating their relational information.
arXiv Detail & Related papers (2020-02-21T13:48:13Z) - DeGAN : Data-Enriching GAN for Retrieving Representative Samples from a
Trained Classifier [58.979104709647295]
We bridge the gap between the abundance of available data and lack of relevant data, for the future learning tasks of a trained network.
We use the available data, that may be an imbalanced subset of the original training dataset, or a related domain dataset, to retrieve representative samples.
We demonstrate that data from a related domain can be leveraged to achieve state-of-the-art performance.
arXiv Detail & Related papers (2019-12-27T02:05:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.