Similarity metrics for Different Market Scenarios in Abides
- URL: http://arxiv.org/abs/2107.09352v1
- Date: Tue, 20 Jul 2021 09:18:06 GMT
- Title: Similarity metrics for Different Market Scenarios in Abides
- Authors: Diego Pino, Javier García, Fernando Fernández, Svitlana S Vyetrenko
- Abstract summary: Markov Decision Processes (MDPs) are an effective way to formally describe many Machine Learning problems.
This paper analyzes the use of three similarity metrics based on conceptual, structural and performance aspects of the financial MDPs.
- Score: 58.720142291102135
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Markov Decision Processes (MDPs) are an effective way to formally describe
many Machine Learning problems. In fact, recently MDPs have also emerged as a
powerful framework to model financial trading tasks. For example, financial
MDPs can model different market scenarios. However, the learning of a
(near-)optimal policy for each of these financial MDPs can be a very
time-consuming process, especially when nothing is known about the policy to
begin with. An alternative approach is to find a similar financial MDP for
which we have already learned its policy, and then reuse such policy in the
learning of a new policy for a new financial MDP. Such a knowledge transfer
between market scenarios raises two issues: on the one hand, how to measure
the similarity between financial MDPs; on the other hand, how to use this
similarity measurement to transfer knowledge between financial MDPs
effectively. This paper addresses both of these issues. Regarding the first one, this
paper analyzes the use of three similarity metrics based on conceptual,
structural and performance aspects of the financial MDPs. Regarding the second
one, this paper uses Probabilistic Policy Reuse to balance the
exploitation/exploration in the learning of a new financial MDP according to
the similarity of the previous financial MDPs whose knowledge is reused.
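
To make the transfer mechanism above concrete, here is a minimal sketch, in Python, of the π-reuse strategy behind Probabilistic Policy Reuse: at each step the agent follows the past policy with probability ψ and otherwise acts ε-greedily on the new task's Q-values, with ψ decaying within the episode. The paper's three similarity metrics are not reproduced here; the `structural_similarity` stand-in (one minus the mean total-variation distance between transition models) and the use of that score to seed the reuse probability are illustrative assumptions, as are `env` and its `reset()`/`step()` interface.

```python
import numpy as np

def structural_similarity(P_a, P_b):
    """Toy stand-in for a structural similarity metric between two MDPs,
    given transition tensors P[s, a, s']: 1 minus the mean total-variation
    distance between transition distributions (NOT the paper's metric)."""
    return 1.0 - 0.5 * np.abs(P_a - P_b).sum(axis=2).mean()

def pi_reuse_episode(env, Q_new, policy_old, psi0, decay=0.95, eps=0.1,
                     alpha=0.1, gamma=0.95, max_steps=200, rng=None):
    """One episode of the pi-reuse strategy (Probabilistic Policy Reuse):
    with probability psi follow the old policy, otherwise act eps-greedily
    on Q_new; psi decays each step, shifting from reuse of old knowledge
    toward exploration of the new financial MDP."""
    rng = rng or np.random.default_rng()
    s = env.reset()                 # assumed minimal env interface
    psi = psi0
    for _ in range(max_steps):
        if rng.random() < psi:                      # reuse the past policy
            a = int(policy_old[s])
        elif rng.random() < eps:                    # explore
            a = int(rng.integers(Q_new.shape[1]))
        else:                                       # exploit new Q-values
            a = int(np.argmax(Q_new[s]))
        s2, r, done = env.step(a)   # assumed to return (state, reward, done)
        # Standard Q-learning backup on the new task
        Q_new[s, a] += alpha * (r + gamma * np.max(Q_new[s2]) - Q_new[s, a])
        psi *= decay
        s = s2
        if done:
            break
    return Q_new
```

A natural coupling, under these assumptions, is to set the initial reuse probability from the similarity score, e.g. `psi0 = structural_similarity(P_new, P_old)`, so that policies learned in more similar market scenarios are reused more aggressively.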
Related papers
- MDP Geometry, Normalization and Value Free Solvers [15.627546283580166]
The Markov Decision Process (MDP) is a widely used mathematical model for sequential decision-making problems.
We show that MDPs can be divided into equivalence classes in which the dynamics of key solving algorithms are indistinguishable.
arXiv Detail & Related papers (2024-07-09T09:39:45Z)
- Near-Optimal Learning and Planning in Separated Latent MDPs [70.88315649628251]
We study computational and statistical aspects of learning Latent Markov Decision Processes (LMDPs).
In this model, the learner interacts with an MDP drawn at the beginning of each epoch from an unknown mixture of MDPs.
arXiv Detail & Related papers (2024-06-12T06:41:47Z)
- Beyond Surface Similarity: Detecting Subtle Semantic Shifts in Financial Narratives [19.574432889355627]
We introduce the Financial-STS task, a financial domain-specific NLP task designed to measure the nuanced semantic similarity between pairs of financial narratives.
Measuring the subtle semantic differences between these paired narratives enables market stakeholders to gauge changes over time in the company's financial and operational situations.
arXiv Detail & Related papers (2024-03-21T12:17:59Z)
- A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes [13.466249082564213]
We propose an optimistic variant of PPO for episodic adversarial linear MDPs with full-information feedback.
Compared with existing policy-based algorithms, we achieve the state-of-the-art regret bound in both linear MDPs and adversarial linear MDPs with full information.
arXiv Detail & Related papers (2023-05-15T17:55:24Z)
- Policy Dispersion in Non-Markovian Environment [53.05904889617441]
This paper learns diverse policies from histories of state-action pairs in a non-Markovian environment.
We first adopt a transformer-based method to learn policy embeddings.
Then, we stack the policy embeddings to construct a dispersion matrix to induce a set of diverse policies.
arXiv Detail & Related papers (2023-02-28T11:58:39Z)
- Robust Anytime Learning of Markov Decision Processes [8.799182983019557]
In data-driven applications, deriving precise probabilities from limited data introduces statistical errors.
Uncertain MDPs (uMDPs) do not require precise probabilities but instead use so-called uncertainty sets in the transitions.
We propose a robust anytime-learning approach that combines a dedicated Bayesian inference scheme with the computation of robust policies (a minimal robust value-iteration sketch appears after this list).
arXiv Detail & Related papers (2022-05-31T14:29:55Z)
- Bridging the gap between QP-based and MPC-based RL [1.90365714903665]
We approximate the policy and value functions using optimization problems in the form of Quadratic Programs (QPs).
A generic unstructured QP offers high flexibility for learning, while a QP having the structure of an MPC scheme promotes the explainability of the resulting policy.
We illustrate the workings of our proposed method with the resulting structure using a point-mass task.
arXiv Detail & Related papers (2022-05-18T10:41:18Z)
- Safe Exploration by Solving Early Terminated MDP [77.10563395197045]
We introduce a new approach to address safe RL problems under the framework of the Early Terminated MDP (ET-MDP).
We first define the ET-MDP as an unconstrained MDP with the same optimal value function as its corresponding CMDP.
An off-policy algorithm based on context models is then proposed to solve the ET-MDP, which thereby solves the corresponding CMDP with better performance and improved learning efficiency.
arXiv Detail & Related papers (2021-07-09T04:24:40Z)
- Exploration-Exploitation in Constrained MDPs [79.23623305214275]
We investigate the exploration-exploitation dilemma in Constrained Markov Decision Processes (CMDPs).
While learning in an unknown CMDP, an agent must trade off exploring to discover new information about the MDP against exploiting its current knowledge.
While the agent will eventually learn a good or optimal policy, we do not want it to violate the constraints too often during the learning process.
arXiv Detail & Related papers (2020-03-04T17:03:56Z)
- Gaussian process imputation of multiple financial series [71.08576457371433]
Multiple time series such as financial indicators, stock prices and exchange rates are strongly coupled due to their dependence on the latent state of the market.
We focus on learning the relationships among financial time series by modelling them through a multi-output Gaussian process.
arXiv Detail & Related papers (2020-02-11T19:18:18Z)
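
To make the uncertainty-set idea in the Robust Anytime Learning entry concrete, the sketch below shows generic interval-based robust value iteration: each transition probability is only known to lie in an interval, and the Bellman backup takes the worst case over that set. This is the standard robust dynamic-programming backup that robust-policy computation for uMDPs builds on, not that paper's Bayesian anytime-learning scheme; it assumes feasible intervals (lower bounds summing to at most 1, upper bounds to at least 1), and all names are illustrative.

```python
import numpy as np

def worst_case_expectation(lo, hi, V):
    """Adversarial transition distribution p with lo <= p <= hi and
    sum(p) == 1 that minimizes p . V, found greedily by pushing as much
    probability mass as possible onto low-value successor states."""
    p = lo.copy()
    budget = 1.0 - p.sum()          # probability mass left to distribute
    for s in np.argsort(V):         # lowest-value successors first
        add = min(hi[s] - lo[s], budget)
        p[s] += add
        budget -= add
        if budget <= 1e-12:
            break
    return p @ V

def robust_value_iteration(P_lo, P_hi, R, gamma=0.95, iters=500, tol=1e-8):
    """Robust value iteration for an interval uMDP.
    P_lo, P_hi: (S, A, S) bounds on transition probabilities; R: (S, A)
    rewards. Returns the worst-case optimal value function."""
    S, A, _ = P_lo.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = np.array([[R[s, a] + gamma *
                       worst_case_expectation(P_lo[s, a], P_hi[s, a], V)
                       for a in range(A)] for s in range(S)])
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V
```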
This list is automatically generated from the titles and abstracts of the papers on this site.