Clustering Market Regimes using the Wasserstein Distance
- URL: http://arxiv.org/abs/2110.11848v1
- Date: Fri, 22 Oct 2021 15:27:52 GMT
- Title: Clustering Market Regimes using the Wasserstein Distance
- Authors: Blanka Horvath, Zacharia Issa, Aitor Muguruza
- Abstract summary: We outline an unsupervised learning algorithm for clustering financial time-series into a suitable number of temporal segments.
We develop a robust algorithm that automates the process of classifying market regimes.
In both cases it is shown that the WK-means algorithm vastly outperforms all considered competitor approaches.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: The problem of rapid and automated detection of distinct market regimes is a
topic of great interest to financial mathematicians and practitioners alike. In
this paper, we outline an unsupervised learning algorithm for clustering
financial time-series into a suitable number of temporal segments (market
regimes). As a special case of the above, we develop a robust algorithm that
automates the process of classifying market regimes. The method is robust in
the sense that it does not depend on modelling assumptions of the underlying
time series as our experiments with real datasets show. This method -- dubbed
the Wasserstein $k$-means algorithm -- frames such a problem as one on the
space of probability measures with finite $p^\text{th}$ moment, in terms of the
$p$-Wasserstein distance between (empirical) distributions. We compare our
WK-means approach with a more traditional clustering algorithms by studying the
so-called maximum mean discrepancy scores between, and within clusters. In both
cases it is shown that the WK-means algorithm vastly outperforms all considered
competitor approaches. We demonstrate the performance of all approaches both in
a controlled environment on synthetic data, and on real data.
Related papers
- Geometric Median (GM) Matching for Robust Data Pruning [29.458270105150564]
Data pruning is crucial for mitigating the enormous computational costs associated with training data-hungry models at scale.
In this work, we propose Geometric ($gm$) Matching that yields a $k$-subset such that the mean of the subset approximates the median of the (potentially) noisy dataset.
Experiments across popular deep learning benchmarks indicate that $gm$ Matching consistently outperforms prior state-of-the-art.
arXiv Detail & Related papers (2024-06-25T00:02:01Z) - Automated regime detection in multidimensional time series data using
sliced Wasserstein k-means clustering [0.0]
We study the behaviour of the Wasserstein k-means clustering algorithm applied to time series data.
We extend the technique to multidimensional time series data by approximating the multidimensional Wasserstein distance as a sliced Wasserstein distance.
We show that the sWk-means method is effective in identifying distinct market regimes in real multidimensional financial time series.
arXiv Detail & Related papers (2023-10-02T15:37:56Z) - Differentially-Private Hierarchical Clustering with Provable
Approximation Guarantees [79.59010418610625]
We study differentially private approximation algorithms for hierarchical clustering.
We show strong lower bounds for the problem: that any $epsilon$-DP algorithm must exhibit $O(|V|2/ epsilon)$-additive error for an input dataset.
We propose a private $1+o(1)$ approximation algorithm which also recovers the blocks exactly.
arXiv Detail & Related papers (2023-01-31T19:14:30Z) - Scalable Differentially Private Clustering via Hierarchically Separated
Trees [82.69664595378869]
We show that our method computes a solution with cost at most $O(d3/2log n)cdot OPT + O(k d2 log2 n / epsilon2)$, where $epsilon$ is the privacy guarantee.
Although the worst-case guarantee is worse than that of state of the art private clustering methods, the algorithm we propose is practical.
arXiv Detail & Related papers (2022-06-17T09:24:41Z) - Finite-Time Analysis of Fully Decentralized Single-Timescale
Actor-Critic [4.94128206910124]
We introduce a fully decentralized Actor-Critic (AC) algorithm, where actor, critic, and global reward estimator are updated in an alternating manner.
We show that our algorithm has sample complexity of $tildemathcalO(epsilon-2)$ under Markovian sampling.
We also provide a local action privacy-preserving version of our algorithm and its analysis.
arXiv Detail & Related papers (2022-06-12T13:14:14Z) - Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback.
It includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as its subroutines.
We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower boundally, and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z) - Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms.
For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime.
In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem.
We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
arXiv Detail & Related papers (2021-09-13T18:10:52Z) - Fuzzy Clustering with Similarity Queries [56.96625809888241]
The fuzzy or soft objective is a popular generalization of the well-known $k$-means problem.
We show that by making few queries, the problem becomes easier to solve.
arXiv Detail & Related papers (2021-06-04T02:32:26Z) - Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge
Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.