Clustering of timed sequences -- Application to the analysis of care pathways
- URL: http://arxiv.org/abs/2404.15379v1
- Date: Tue, 23 Apr 2024 07:16:13 GMT
- Title: Clustering of timed sequences -- Application to the analysis of care pathways
- Authors: Thomas Guyet, Pierre Pinson, Enoal Gesny
- Abstract summary: Revealing homogeneous groups of care pathways can be achieved through clustering.
The difficulty in clustering care pathways, represented by sequences of timestamped events, lies in defining a semantically appropriate metric and clustering algorithms.
In this article, we adapt two methods developed for time series to time sequences: the drop-DTW metric and the DBA approach for the construction of averaged time sequences.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Improving the future of healthcare starts by better understanding the current actual practices in hospitals. This motivates the objective of discovering typical care pathways from patient data. Revealing homogeneous groups of care pathways can be achieved through clustering. The difficulty in clustering care pathways, represented by sequences of timestamped events, lies in defining a semantically appropriate metric and clustering algorithms. In this article, we adapt two methods developed for time series to time sequences: the drop-DTW metric and the DBA approach for the construction of averaged time sequences. These methods are then applied in clustering algorithms to propose original and sound clustering algorithms for timed sequences. The approach is evaluated experimentally on synthetic and real use cases.
Related papers
- Evaluation of k-means time series clustering based on z-normalization and NP-Free [0.5898893619901381]
This paper conducts a thorough performance evaluation of k-means time series clustering on real-world open-source time series datasets.
The evaluation focuses on two distinct normalization techniques: z-normalization and NP-Free.
The primary objective of this paper is to assess the impact of these two normalization techniques on k-means time series clustering in terms of its clustering quality.
arXiv Detail & Related papers (2024-01-28T21:23:13Z) - Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z) - Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models [61.10851158749843]
Key insights can be obtained by discovering lead-lag relationships inherent in the data.
We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models.
arXiv Detail & Related papers (2023-05-11T10:30:35Z) - Fuzzy clustering of ordinal time series based on two novel distances with economic applications [0.12891210250935145]
Two novel distances between ordinal time series are introduced and used to construct fuzzy clustering procedures.
The resulting clustering algorithms are computationally efficient and able to group series generated from similar processes.
Two specific applications involving economic time series illustrate the usefulness of the proposed approaches.
arXiv Detail & Related papers (2023-04-24T16:39:22Z) - Efficient Approximate Kernel Based Spike Sequence Classification [56.2938724367661]
Machine learning models, such as SVM, require a definition of distance/similarity between pairs of sequences.
Exact methods yield better classification performance, but they pose high computational costs.
We propose a series of ways to improve the performance of the approximate kernel in order to enhance its predictive performance.
arXiv Detail & Related papers (2022-09-11T22:44:19Z) - Early Time-Series Classification Algorithms: An Empirical Comparison [59.82930053437851]
Early Time-Series Classification (ETSC) is the task of predicting the class of incoming time-series by observing as few measurements as possible.
We evaluate six existing ETSC algorithms on publicly available data, as well as on two newly introduced datasets.
arXiv Detail & Related papers (2022-03-03T10:43:56Z) - Scalable Intervention Target Estimation in Linear Models [52.60799340056917]
Current approaches to causal structure learning either work with known intervention targets or use hypothesis testing to discover the unknown intervention targets.
This paper proposes a scalable and efficient algorithm that consistently identifies all intervention targets.
The proposed algorithm can be used to also update a given observational Markov equivalence class into the interventional Markov equivalence class.
arXiv Detail & Related papers (2021-11-15T03:16:56Z) - Cluster-and-Conquer: A Framework For Time-Series Forecasting [94.63501563413725]
We propose a three-stage framework for forecasting high-dimensional time-series data.
Our framework is highly general, allowing for any time-series forecasting and clustering method to be used in each step.
When instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets.
arXiv Detail & Related papers (2021-10-26T20:41:19Z) - SOMTimeS: Self Organizing Maps for Time Series Clustering and its Application to Serious Illness Conversations [3.2689702143620147]
We present a new DTW-based clustering method called SOMTimeS (a Self-Organizing Map for TIME Series).
It scales better and runs faster than other DTW-based clustering algorithms, and has similar performance accuracy.
We applied SOMTimeS to natural language conversation data collected as part of a large healthcare cohort study.
arXiv Detail & Related papers (2021-08-26T00:18:25Z) - COHORTNEY: Deep Clustering for Heterogeneous Event Sequences [9.811178291117496]
Clustering of event sequences is widely applicable in domains such as healthcare, marketing, and finance.
We propose COHORTNEY as a novel deep learning method for clustering heterogeneous event sequences.
Our results show that COHORTNEY vastly outperforms the state-of-the-art algorithm for clustering event sequences in both speed and cluster quality.
arXiv Detail & Related papers (2021-04-03T16:12:21Z) - Autoencoder-based time series clustering with energy applications [0.0]
Time series clustering is a challenging task due to the specific nature of the data.
In this paper we investigate the combination of a convolutional autoencoder and a k-medoids algorithm to perform time series clustering.
arXiv Detail & Related papers (2020-02-10T10:04:29Z)