SOMTimeS: Self Organizing Maps for Time Series Clustering and its
Application to Serious Illness Conversations
- URL: http://arxiv.org/abs/2108.11523v1
- Date: Thu, 26 Aug 2021 00:18:25 GMT
- Title: SOMTimeS: Self Organizing Maps for Time Series Clustering and its
Application to Serious Illness Conversations
- Authors: Ali Javed, Donna M. Rizzo, Byung Suk Lee, Robert Gramling
- Abstract summary: We present a new DTW-based clustering method called SOMTimeS (a Self-Organizing Map for TIME Series).
It scales better and runs faster than other DTW-based clustering algorithms, and has similar performance accuracy.
We applied SOMTimeS to natural language conversation data collected as part of a large healthcare cohort study.
- Score: 3.2689702143620147
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There is an increasing demand for scalable algorithms capable of clustering
and analyzing large time series datasets. The Kohonen self-organizing map (SOM)
is a type of unsupervised artificial neural network for visualizing and
clustering complex data, reducing the dimensionality of data, and selecting
influential features. Like all clustering methods, the SOM requires a measure
of similarity between input data (in this work time series). Dynamic time
warping (DTW) is one such measure, and a top performer given that it
accommodates the distortions when aligning time series. Despite its use in
clustering, DTW is limited in practice because it is quadratic in runtime
complexity with the length of the time series data. To address this, we present
a new DTW-based clustering method, called SOMTimeS (a Self-Organizing Map for
TIME Series), that scales better and runs faster than other DTW-based
clustering algorithms, and has similar performance accuracy. The computational
performance of SOMTimeS stems from its ability to prune unnecessary DTW
computations during the SOM's training phase. We also implemented a similar
pruning strategy for K-means for comparison with one of the top performing
clustering algorithms. We evaluated the pruning effectiveness, accuracy,
execution time and scalability on 112 benchmark time series datasets from the
University of California, Riverside classification archive. We showed that for
similar accuracy, the speed-up achieved for SOMTimeS and K-means was 1.8x on
average; however, rates varied between 1x and 18x depending on the dataset.
SOMTimeS and K-means pruned 43% and 50% of the total DTW computations,
respectively. We applied SOMTimeS to natural language conversation data
collected as part of a large healthcare cohort study of patient-clinician
serious illness conversations to demonstrate the algorithm's utility with
complex, temporally sequenced phenomena.
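The abstract's complexity claim refers to the standard dynamic-programming formulation of DTW, which fills a full cost matrix between the two series. Below is a minimal Python sketch of that recurrence (illustrative only, not the authors' code); the optional `window` parameter is the common Sakoe-Chiba band restriction and is included here as an assumption so it can be reused in the pruning sketch that follows.

```python
import numpy as np

def dtw_distance(x, y, window=None):
    """Dynamic-programming DTW between two 1-D series.

    Fills an (n+1) x (m+1) cost matrix, hence the O(n*m) runtime the
    abstract describes as quadratic in the series length.  `window` is an
    optional Sakoe-Chiba radius limiting how far the alignment may stray
    from the diagonal; None means unconstrained.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    w = max(n, m) if window is None else max(window, abs(n - m))
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            # extend the cheapest of the three admissible warping steps
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return np.sqrt(cost[n, m])
```

For example, `dtw_distance([1, 2, 3, 4], [1, 1, 2, 3, 4])` returns 0 because the shorter series can be warped onto the longer one exactly. Because the cost matrix has n x m cells, doubling the series length roughly quadruples the work, which is why pruning entire DTW calls pays off more than merely speeding up each call.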
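The abstract attributes SOMTimeS's speed to pruning unnecessary DTW computations during SOM training (about 43% of them in the benchmark experiments). The exact pruning rule is not spelled out in this summary; the sketch below shows one standard way such pruning can be realized, using the LB_Keogh lower bound to skip the full DTW call in the best-matching-unit (BMU) search whenever the bound already exceeds the best distance found so far. The function names (`keogh_envelope`, `lb_keogh`, `find_bmu_pruned`) are hypothetical, the choice of LB_Keogh is an assumption rather than the authors' documented mechanism, and the code reuses the `dtw_distance` helper from the sketch above.

```python
import numpy as np

def keogh_envelope(q, r):
    """Upper/lower envelope of series q for a Sakoe-Chiba radius r."""
    q = np.asarray(q, dtype=float)
    n = len(q)
    lower, upper = np.empty(n), np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        lower[i], upper[i] = q[lo:hi].min(), q[lo:hi].max()
    return lower, upper

def lb_keogh(candidate, lower, upper):
    """O(n) lower bound on the r-constrained DTW distance (equal-length series)."""
    c = np.asarray(candidate, dtype=float)
    above = np.clip(c - upper, 0.0, None)   # where the candidate rises above the envelope
    below = np.clip(lower - c, 0.0, None)   # where it dips below the envelope
    return np.sqrt(np.sum(above ** 2 + below ** 2))

def find_bmu_pruned(sample, weights, r):
    """Best-matching-unit search that skips DTW whenever the bound already loses.

    `weights` is the SOM codebook: one prototype series per map node.
    Returns the winning node index and how many full DTW calls were avoided.
    """
    lower, upper = keogh_envelope(sample, r)
    best_idx, best_dist, pruned = -1, np.inf, 0
    for idx, w in enumerate(weights):
        if lb_keogh(w, lower, upper) >= best_dist:
            pruned += 1                        # bound proves this node cannot win
            continue
        d = dtw_distance(sample, w, window=r)  # full DTW only when necessary
        if d < best_dist:
            best_idx, best_dist = idx, d
    return best_idx, pruned
```

In a full SOM training loop, the winner returned by `find_bmu_pruned` and its map neighbors would then be nudged toward the input series; the pruning only changes how cheaply the winner is found, not which node wins, which is consistent with the abstract's claim of speed-ups at similar accuracy.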
Related papers
- Concrete Dense Network for Long-Sequence Time Series Clustering [4.307648859471193]
Time series clustering is fundamental in data analysis for discovering temporal patterns.
Deep temporal clustering methods aim to integrate the canonical k-means objective into end-to-end training of neural networks.
LoSTer is a novel dense autoencoder architecture for the long-sequence time series clustering problem.
arXiv Detail & Related papers (2024-05-08T12:31:35Z)
- Clustering of timed sequences -- Application to the analysis of care pathways [0.0]
Revealing typical care pathways can be achieved through clustering.
The difficulty in clustering care pathways, represented by sequences of timestamped events, lies in defining a semantically appropriate metric and clustering algorithms.
arXiv Detail & Related papers (2024-04-23T07:16:13Z)
- Fast dynamic time warping and clustering in C++ [0.0]
We present an approach for computationally efficient dynamic time warping (DTW) and clustering of time-series data.
The method frames the dynamic warping of time series datasets as an optimisation problem solved using dynamic programming.
There is also an option to use k-medoids clustering for increased speed, when a certificate for global optimality is not essential.
arXiv Detail & Related papers (2023-07-10T21:08:27Z)
- OTW: Optimal Transport Warping for Time Series [75.69837166816501]
Dynamic Time Warping (DTW) has become the pragmatic choice for measuring distance between time series.
It suffers from unavoidable quadratic time complexity when the optimal alignment matrix needs to be computed exactly.
We introduce a new metric for time series data based on the Optimal Transport framework, called Optimal Transport Warping (OTW).
arXiv Detail & Related papers (2023-06-01T12:45:00Z)
- Clustering Method for Time-Series Images Using Quantum-Inspired Computing Technology [0.0]
Time-series clustering serves as a powerful data mining technique for time-series data in the absence of prior knowledge about clusters.
This study proposes a novel time-series clustering method that leverages an annealing machine.
arXiv Detail & Related papers (2023-05-26T05:58:14Z)
- Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct the distance matrix between data points using a Butterworth filter.
To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
- Adaptively-weighted Integral Space for Fast Multiview Clustering [54.177846260063966]
We propose an Adaptively-weighted Integral Space for Fast Multiview Clustering (AIMC) with nearly linear complexity.
Specifically, view generation models are designed to reconstruct the view observations from the latent integral space.
Experiments conducted on several real-world datasets confirm the superiority of the proposed AIMC method.
arXiv Detail & Related papers (2022-08-25T05:47:39Z)
- Towards Similarity-Aware Time-Series Classification [51.2400839966489]
We study time-series classification (TSC), a fundamental task of time-series data mining.
We propose Similarity-Aware Time-Series Classification (SimTSC), a framework that models similarity information with graph neural networks (GNNs).
arXiv Detail & Related papers (2022-01-05T02:14:57Z)
- Cluster-and-Conquer: A Framework For Time-Series Forecasting [94.63501563413725]
We propose a three-stage framework for forecasting high-dimensional time-series data.
Our framework is highly general, allowing for any time-series forecasting and clustering method to be used in each step.
When instantiated with simple linear autoregressive models, we are able to achieve state-of-the-art results on several benchmark datasets.
arXiv Detail & Related papers (2021-10-26T20:41:19Z)
- Hierarchical Clustering using Auto-encoded Compact Representation for Time-series Analysis [8.660029077292346]
We propose a novel mechanism to identify the clusters combining learned compact representation of time-series, Auto Encoded Compact Sequence (AECS) and hierarchical clustering approach.
Our algorithm exploits a Recurrent Neural Network (RNN) based undercomplete Sequence-to-Sequence (seq2seq) autoencoder and agglomerative hierarchical clustering.
arXiv Detail & Related papers (2021-01-11T08:03:57Z)
- (k, l)-Medians Clustering of Trajectories Using Continuous Dynamic Time Warping [57.316437798033974]
In this work we consider the problem of center-based clustering of trajectories.
We propose using a continuous version of DTW as the distance measure, which we call continuous dynamic time warping (CDTW).
We show a practical way to compute a center from a set of trajectories and subsequently iteratively improve it.
arXiv Detail & Related papers (2020-12-01T13:17:27Z)