Optimal Stopping with Gaussian Processes
- URL: http://arxiv.org/abs/2209.14738v1
- Date: Thu, 22 Sep 2022 21:19:27 GMT
- Title: Optimal Stopping with Gaussian Processes
- Authors: Kshama Dwarakanath, Danial Dervovic, Peyman Tavallali, Svitlana S
Vyetrenko, Tucker Balch
- Abstract summary: We show that structural properties commonly exhibited by financial time series allow the use of Gaussian and Deep Gaussian Process models.
We additionally quantify uncertainty in the value function by propagating the price model through the optimal stopping analysis.
We show that our family of algorithms outperforms benchmarks on three historical time series datasets.
- Score: 2.126171264016785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel group of Gaussian Process based algorithms for fast
approximate optimal stopping of time series with specific applications to
financial markets. We show that structural properties commonly exhibited by
financial time series (e.g., the tendency to mean-revert) allow the use of
Gaussian and Deep Gaussian Process models that further enable us to
analytically evaluate optimal stopping value functions and policies. We
additionally quantify uncertainty in the value function by propagating the
price model through the optimal stopping analysis. We compare and contrast our
proposed methods against a sampling-based method, as well as a deep learning
based benchmark that is currently considered the state-of-the-art in the
literature. We show that our family of algorithms outperforms benchmarks on
three historical time series datasets that include intra-day and end-of-day
equity asset prices as well as the daily US treasury yield curve rates.
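The abstract above is high-level, so the following is a minimal, hypothetical Python sketch of the kind of pipeline it describes: fit a GP to an observed mean-reverting price history, then estimate an optimal stopping value over a short future horizon by backward induction on samples drawn from the GP posterior. This is a simplified sampling-based illustration (closer in spirit to the sampling benchmark the authors compare against), not the paper's analytic evaluation; the kernel, hyperparameters, and all function names are assumptions.

```python
# Minimal sketch (an assumption, not the paper's analytic method): fit a GP to
# an observed price path and estimate the value of stopping optimally over a
# short horizon by backward induction on samples from the GP posterior.
import numpy as np

def rbf_kernel(a, b, length_scale=5.0, variance=1.0):
    """Squared-exponential kernel for 1-D time inputs."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(t_obs, y_obs, t_new, noise=1e-2):
    """GP posterior mean and covariance at future times t_new."""
    K = rbf_kernel(t_obs, t_obs) + noise * np.eye(len(t_obs))
    Ks = rbf_kernel(t_obs, t_new)
    Kss = rbf_kernel(t_new, t_new)
    mean = Ks.T @ np.linalg.solve(K, y_obs)
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
    return mean, cov

def stopping_value(mean, cov, n_paths=2000, seed=0):
    """Monte Carlo backward induction for the payoff 'price at the stop':
    V_T = p_T, V_t = max(p_t, continuation), with the continuation value
    approximated by the cross-path average (a constant-regression shortcut)."""
    rng = np.random.default_rng(seed)
    paths = rng.multivariate_normal(mean, cov + 1e-8 * np.eye(len(mean)),
                                    size=n_paths)
    value = paths[:, -1].copy()
    for t in range(paths.shape[1] - 2, -1, -1):
        value = np.maximum(paths[:, t], value.mean())
    return value.mean()

# Toy mean-reverting (AR(1) / OU-like) price history.
rng = np.random.default_rng(1)
y_obs = np.zeros(100)
for i in range(1, 100):
    y_obs[i] = 0.9 * y_obs[i - 1] + rng.normal(scale=0.1)
t_obs = np.arange(100.0)

t_new = np.arange(100.0, 110.0)  # 10-step stopping horizon
mean, cov = gp_posterior(t_obs, y_obs, t_new)
print("estimated stopping value:", stopping_value(mean, cov))
```

In the paper itself, the mean-reversion structure is exploited so that these quantities can be evaluated analytically rather than by sampling.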
Related papers
- Efficient Learning of POMDPs with Known Observation Model in Average-Reward Setting [56.92178753201331]
We propose the Observation-Aware Spectral (OAS) estimation technique, which enables the POMDP parameters to be learned from samples collected using a belief-based policy.
We show the consistency of the OAS procedure, and we prove a regret guarantee of order $\mathcal{O}(\sqrt{T} \log(T))$ for the proposed OAS-UCRL algorithm.
arXiv Detail & Related papers (2024-10-02T08:46:34Z) - Loss Shaping Constraints for Long-Term Time Series Forecasting [79.3533114027664]
We present a Constrained Learning approach for long-term time series forecasting that respects a user-defined upper bound on the loss at each time-step.
We propose a practical Primal-Dual algorithm to tackle it, and demonstrate that it exhibits competitive average performance in time series benchmarks while shaping the errors across the predicted window (a generic primal-dual sketch follows this list).
arXiv Detail & Related papers (2024-02-14T18:20:44Z) - Rate-Optimal Policy Optimization for Linear Markov Decision Processes [65.5958446762678]
We obtain rate-optimal $\widetilde{O}(\sqrt{K})$ regret, where $K$ denotes the number of episodes.
Our work is the first to establish the optimal (w.r.t. $K$) rate of convergence in the setting with bandit feedback, for which no algorithm with an optimal rate guarantee was previously known.
arXiv Detail & Related papers (2023-08-28T15:16:09Z) - $K$-Nearest-Neighbor Resampling for Off-Policy Evaluation in Stochastic
Control [0.6906005491572401]
We propose a novel $K$-nearest neighbor resampling procedure for estimating the performance of a policy from historical data.
Our analysis allows for the sampling of entire episodes, as is common practice in most applications.
Compared to other OPE methods, our algorithm does not require optimization, can be efficiently implemented via tree-based nearest neighbor search and parallelization, and does not explicitly assume a parametric model for the environment's dynamics.
arXiv Detail & Related papers (2023-06-07T23:55:12Z) - Autoregressive Bandits [58.46584210388307]
We propose a novel online learning setting, Autoregressive Bandits, in which the observed reward is governed by an autoregressive process of order $k$.
We show that, under mild assumptions on the reward process, the optimal policy can be conveniently computed.
We then devise a new optimistic regret minimization algorithm, namely, AutoRegressive Upper Confidence Bound (AR-UCB), that suffers sublinear regret of order $\widetilde{\mathcal{O}}\!\left(\frac{(k+1)^{3/2}\sqrt{nT}}{(1-\Gamma)\cdots}\right)$.
arXiv Detail & Related papers (2022-12-12T21:37:36Z) - Adversarial Robustness Guarantees for Gaussian Processes [22.403365399119107]
Gaussian processes (GPs) enable principled computation of model uncertainty, making them attractive for safety-critical applications.
We present a framework to analyse adversarial robustness of GPs, defined as invariance of the model's decision to bounded perturbations.
We develop a branch-and-bound scheme to refine the bounds and show, for any $\epsilon > 0$, that our algorithm is guaranteed to converge to values $\epsilon$-close to the actual values in finitely many iterations.
arXiv Detail & Related papers (2021-04-07T15:14:56Z) - Interpretable ML-driven Strategy for Automated Trading Pattern
Extraction [2.7910505923792646]
We propose a volume-based data pre-processing method for financial time series analysis.
We use a statistical approach for assessing the performance of the method.
Our analysis shows that the proposed method allows successful classification of the financial time series patterns.
arXiv Detail & Related papers (2021-03-23T09:55:46Z) - Sparse Algorithms for Markovian Gaussian Processes [18.999495374836584]
Sparse Markovian Gaussian processes combine the use of inducing variables with efficient Kalman filter-like recursions.
We derive a general site-based approach to approximate the non-Gaussian likelihood with local Gaussian terms, called sites.
Our approach results in a suite of novel sparse extensions to algorithms from both the machine learning and signal processing literatures, including variational inference, expectation propagation, and the classical nonlinear Kalman smoothers.
The derived methods are suited to spatio-temporal data, where the model has separate inducing points in both time and space.
arXiv Detail & Related papers (2021-03-19T09:50:53Z) - CoinDICE: Off-Policy Confidence Interval Estimation [107.86876722777535]
We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning.
We show in a variety of benchmarks that the confidence interval estimates are tighter and more accurate than existing methods.
arXiv Detail & Related papers (2020-10-22T12:39:11Z) - Early Classification of Time Series. Cost-based Optimization Criterion
and Algorithms [0.0]
In this paper, we put forward a new optimization criterion which takes into account both the cost of misclassification and the cost of delaying the decision.
We derived a family of non-myopic algorithms which try to anticipate the expected future gain in information in balance with the cost of waiting.
arXiv Detail & Related papers (2020-05-20T10:08:30Z) - Time-varying Gaussian Process Bandit Optimization with Non-constant
Evaluation Time [93.6788993843846]
We propose a novel time-varying Bayesian optimization algorithm that can effectively handle the non-constant evaluation time.
Our bound elucidates that a pattern of the evaluation time sequence can hugely affect the difficulty of the problem.
arXiv Detail & Related papers (2020-03-10T13:28:33Z)
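As referenced in the Loss Shaping Constraints entry above, the following is a minimal, hypothetical Python sketch of a generic primal-dual constrained-learning loop that enforces a user-defined upper bound on the loss at each step of the prediction window. The linear forecaster, the bound `epsilon`, the learning rates, and all names are placeholders for illustration, not the paper's implementation.

```python
# Hypothetical primal-dual loop: minimize the average forecasting loss subject
# to loss_t <= epsilon at every step t of the prediction window.
import numpy as np

def per_step_losses(w, X, Y):
    """Mean squared error of a linear forecaster at each step of the window."""
    return ((X @ w - Y) ** 2).mean(axis=0)

def primal_dual(X, Y, epsilon=0.2, lr_w=5e-3, lr_lam=1e-2, iters=2000):
    horizon = Y.shape[1]
    w = np.zeros((X.shape[1], horizon))   # primal: one weight vector per step
    lam = np.zeros(horizon)               # dual: one multiplier per step
    for _ in range(iters):
        # Lagrangian: (1/H) * sum_t loss_t + sum_t lam_t * (loss_t - epsilon),
        # so each step's loss is weighted by (1/H + lam_t).
        weights = 1.0 / horizon + lam
        grad_w = 2 * X.T @ ((X @ w - Y) * weights) / X.shape[0]
        w -= lr_w * grad_w                                    # primal descent
        losses = per_step_losses(w, X, Y)
        lam = np.maximum(0.0, lam + lr_lam * (losses - epsilon))  # dual ascent
    return w, lam

# Toy usage: 200 samples, 8 features, 4-step prediction window.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
Y = X @ rng.normal(size=(8, 4)) + rng.normal(scale=0.3, size=(200, 4))
w, lam = primal_dual(X, Y)
print("per-step losses:", per_step_losses(w, X, Y))
print("dual multipliers:", lam)
```

Each step of the window gets its own dual variable, which grows while that step's loss violates the bound and thereby re-weights the training objective toward the violating steps.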