Feature Selection with Annealing for Forecasting Financial Time Series
- URL: http://arxiv.org/abs/2303.02223v3
- Date: Fri, 23 Feb 2024 13:16:25 GMT
- Title: Feature Selection with Annealing for Forecasting Financial Time Series
- Authors: Hakan Pabuccu, Adrian Barbu
- Abstract summary: This study provides a comprehensive method for forecasting financial time series based on tactical input output feature mapping techniques using machine learning (ML) models.
Experiments indicate that the FSA algorithm increased the performance of ML models, regardless of problem type.
- Score: 2.44755919161855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stock market and cryptocurrency forecasting is very important to investors as
they aspire to achieve even the slightest improvement to their buy or hold
strategies so that they may increase profitability. However, obtaining accurate
and reliable predictions is challenging (accuracy does not equate to
reliability), especially in financial time-series forecasting, which is
complex and chaotic. To mitigate this complexity, this
study provides a comprehensive method for forecasting financial time series
based on tactical input output feature mapping techniques using machine
learning (ML) models. During the prediction process, selecting the relevant
indicators is vital to obtaining the desired results. In the financial field,
limited attention has been paid to this problem with ML solutions. We
investigate the use of feature selection with annealing (FSA) for the first
time in this field, and we apply the least absolute shrinkage and selection
operator (Lasso) method to select the features from more than 1,000 candidates
obtained from 26 technical indicators with different periods and lags. Boruta
(BOR) feature selection, a wrapper method, is used as a baseline for
comparison. Logistic regression (LR), extreme gradient boosting (XGBoost), and
long short-term memory (LSTM) are then applied to the selected features for
forecasting purposes using 10 different financial datasets containing
cryptocurrencies and stocks. The dependent variables consisted of daily
logarithmic returns and trends. The mean-squared error for regression, area
under the receiver operating characteristic curve, and classification accuracy
were used to evaluate model performance, and the statistical significance of
the forecasting results was tested using paired t-tests. Experiments indicate
that the FSA algorithm increased the performance of ML models, regardless of
problem type.
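The abstract names feature selection with annealing (FSA) but does not describe its mechanics. The following is a minimal sketch, not the authors' implementation: it assumes a linear model with squared loss, alternates one gradient step with pruning of the smallest-magnitude coefficients, and uses an annealing schedule of the form published in the original FSA literature. The function names, the default hyperparameters (`lr`, `mu`, `ridge`, `n_epochs`), and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def fsa_schedule(epoch, n_epochs, n_features, k, mu):
    """Number of features to keep after `epoch` (1-based).

    Assumed annealing schedule: starts near n_features and
    reaches k once epoch >= n_epochs / 2.
    """
    frac = max(0.0, (n_epochs - 2 * epoch) / (2 * epoch * mu + n_epochs))
    return k + int((n_features - k) * frac)


def fsa_linear(X, y, k, n_epochs=200, lr=0.05, mu=10.0, ridge=1e-3):
    """Select k features for a linear model with squared loss.

    Each epoch takes one gradient step on the active features, then
    drops the features with the smallest absolute coefficients so that
    the active set matches the schedule.
    """
    n, p = X.shape
    active = np.arange(p)   # indices of features still in the model
    w = np.zeros(p)         # coefficients of the active features
    for epoch in range(1, n_epochs + 1):
        Xa = X[:, active]
        grad = Xa.T @ (Xa @ w - y) / n + ridge * w
        w = w - lr * grad
        keep = fsa_schedule(epoch, n_epochs, p, k, mu)
        if keep < active.size:
            # Keep the `keep` largest coefficients, preserving column order.
            top = np.sort(np.argsort(-np.abs(w))[:keep])
            active, w = active[top], w[top]
    return active, w


if __name__ == "__main__":
    # Synthetic example: 30 candidate features, only 3 are informative.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((400, 30))
    y = 3.0 * X[:, 3] - 2.0 * X[:, 11] + 2.5 * X[:, 27]
    y = y + 0.1 * rng.standard_normal(400)
    selected, coef = fsa_linear(X, y, k=3)
    print("selected feature indices:", selected)
    print("coefficients:", coef)
```

In a forecasting setting, the columns of `X` would be the lagged indicator features and `y` the daily logarithmic return; features should be standardized first, since pruning compares coefficient magnitudes across columns.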
Related papers
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Models (LLMs) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z)
- Enhancing Mean-Reverting Time Series Prediction with Gaussian Processes: Functional and Augmented Data Structures in Financial Forecasting [0.0]
We explore the application of Gaussian Processes (GPs) for predicting mean-reverting time series with an underlying structure.
GPs offer the potential to forecast not just the average prediction but the entire probability distribution over a future trajectory.
This is particularly beneficial in financial contexts, where accurate predictions alone may not suffice if incorrect volatility assessments lead to capital losses.
arXiv Detail & Related papers (2024-02-23T06:09:45Z)
- RF+clust for Leave-One-Problem-Out Performance Prediction [0.9281671380673306]
We study leave-one-problem-out (LOPO) performance prediction.
We analyze whether standard random forest (RF) model predictions can be improved by calibrating them with a weighted average of performance values.
arXiv Detail & Related papers (2023-01-23T16:14:59Z)
- Statistics and Deep Learning-based Hybrid Model for Interpretable Anomaly Detection [0.0]
Hybrid methods have been shown to outperform pure statistical and pure deep learning methods at both forecasting tasks.
MES-LSTM is an interpretable anomaly detection model that overcomes these challenges.
arXiv Detail & Related papers (2022-02-25T14:17:03Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Uncertainty-aware Remaining Useful Life predictor [57.74855412811814]
Remaining Useful Life (RUL) estimation is the problem of inferring how long a certain industrial asset can be expected to operate.
In this work, we consider Deep Gaussian Processes (DGPs) as possible solutions to the aforementioned limitations.
The performance of the algorithms is evaluated on the N-CMAPSS dataset from NASA for aircraft engines.
arXiv Detail & Related papers (2021-04-08T08:50:44Z)
- SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z)
- DoubleEnsemble: A New Ensemble Method Based on Sample Reweighting and Feature Selection for Financial Data Analysis [22.035287788330663]
We propose DoubleEnsemble, an ensemble framework leveraging learning trajectory based sample reweighting and shuffling based feature selection.
Our model is applicable to a wide range of base models, capable of extracting complex patterns, while mitigating the overfitting and instability issues for financial market prediction.
arXiv Detail & Related papers (2020-10-03T02:57:10Z)
- Learning low-frequency temporal patterns for quantitative trading [0.0]
We consider the viability of a modularised online machine learning framework to learn signals in low-frequency financial time series data.
The framework is proved on daily sampled closing time-series data from JSE equity markets.
arXiv Detail & Related papers (2020-08-12T11:59:15Z)
- Deep Stock Predictions [58.720142291102135]
We consider the design of a trading strategy that performs portfolio optimization using Long Short Term Memory (LSTM) neural networks.
We then customize the loss function used to train the LSTM to increase the profit earned.
We find the LSTM model with the customized loss function to have an improved performance in the trading bot over a regressive baseline such as ARIMA.
arXiv Detail & Related papers (2020-06-08T23:37:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.