Line Space Clustering (LSC): Feature-Based Clustering using K-medians and Dynamic Time Warping for Versatility
- URL: http://arxiv.org/abs/2503.15777v1
- Date: Thu, 20 Mar 2025 01:27:10 GMT
- Title: Line Space Clustering (LSC): Feature-Based Clustering using K-medians and Dynamic Time Warping for Versatility
- Authors: Joanikij Chulev, Angela Mladenovska
- Abstract summary: Line Space Clustering (LSC) is a representation that transforms data points into lines in a newly defined feature space. LSC employs a combined distance metric that uses Euclidean and Dynamic Time Warping (DTW) distances, weighted by a parameter alpha. Experiments demonstrate the efficacy of LSC on synthetic and real-world datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Clustering high-dimensional data is a critical challenge in machine learning due to the curse of dimensionality and the presence of noise. Traditional clustering algorithms often fail to capture the intrinsic structures in such data. This paper explores a combination of clustering methods, which we call Line Space Clustering (LSC), a representation that transforms data points into lines in a newly defined feature space, enabling clustering based on the similarity of feature value patterns, essentially treating features as sequences. LSC employs a combined distance metric that uses Euclidean and Dynamic Time Warping (DTW) distances, weighted by a parameter α, allowing flexibility in emphasizing shape or magnitude similarities. We delve deeply into the mechanics of DTW and the Savitzky-Golay filter, explaining their roles in the algorithm. Extensive experiments demonstrate the efficacy of LSC on synthetic and real-world datasets, showing that randomly experimenting with time-series optimized methods sometimes might surprisingly work on a complex dataset, particularly in noisy environments. Source code and experiments are available at: https://github.com/JoanikijChulev/LSC.
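The combined distance described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the convex combination below (alpha weighting the Euclidean term and 1 - alpha weighting the DTW term) is an assumption, as are the function names `euclidean`, `dtw`, and `combined_distance` and the toy vectors. The authors' repository is the reference for the real implementation.

```python
import math


def euclidean(a, b):
    """Pointwise Euclidean distance between two equal-length feature sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Textbook dynamic time warping with an absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] repeats a match against b[j-1]
                cost[i][j - 1],      # b[j-1] repeats a match against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def combined_distance(a, b, alpha=0.5):
    """Hypothetical weighting: alpha * Euclidean + (1 - alpha) * DTW.

    This form is an assumption for illustration; the paper's exact
    combination and normalization may differ.
    """
    return alpha * euclidean(a, b) + (1.0 - alpha) * dtw(a, b)


# Two toy "lines": the same bump, shifted by one feature position.
point = [0.0, 1.0, 2.0, 1.0, 0.0]
shifted = [0.0, 0.0, 1.0, 2.0, 1.0]

for alpha in (0.0, 0.5, 1.0):
    print(alpha, combined_distance(point, shifted, alpha))
```

Under this assumed convention, a small alpha lets the warping-tolerant DTW term dominate, so the shifted bump is treated as close (shape similarity), while alpha near 1 penalizes position-by-position differences (magnitude similarity).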
Related papers
- CoHiRF: A Scalable and Interpretable Clustering Framework for High-Dimensional Data [0.30723404270319693]
We propose Consensus Hierarchical Random Feature (CoHiRF), a novel clustering method designed to address challenges effectively. CoHiRF leverages random feature selection to mitigate noise and dimensionality effects, repeatedly applies K-Means clustering in reduced feature spaces, and combines results through a unanimous consensus criterion. CoHiRF is computationally efficient with a running time comparable to K-Means, scalable to massive datasets, and exhibits robust performance against state-of-the-art methods such as SC-SRGF, HDBSCAN, and OPTICS.
arXiv Detail & Related papers (2025-02-01T09:38:44Z)
- Concrete Dense Network for Long-Sequence Time Series Clustering [4.307648859471193]
Time series clustering is fundamental in data analysis for discovering temporal patterns.
Deep temporal clustering methods have been trying to integrate the canonical k-means into end-to-end training of neural networks.
LoSTer is a novel dense autoencoder architecture for the long-sequence time series clustering problem.
arXiv Detail & Related papers (2024-05-08T12:31:35Z)
- Anchor-based Multi-view Subspace Clustering with Hierarchical Feature Descent [46.86939432189035]
We propose Anchor-based Multi-view Subspace Clustering with Hierarchical Feature Descent.
Our proposed model consistently outperforms the state-of-the-art techniques.
arXiv Detail & Related papers (2023-10-11T03:29:13Z)
- Contrastive Continual Multi-view Clustering with Filtered Structural Fusion [57.193645780552565]
Multi-view clustering thrives in applications where views are collected in advance.
It overlooks scenarios where data views are collected sequentially, i.e., real-time data.
Some methods are proposed to handle it but are trapped in a stability-plasticity dilemma.
We propose Contrastive Continual Multi-view Clustering with Filtered Structural Fusion.
arXiv Detail & Related papers (2023-09-26T14:18:29Z)
- Deep Spatiotemporal Clustering: A Temporal Clustering Approach for Multi-dimensional Climate Data [0.353122873734926]
We propose a novel algorithm for high-dimensional temporal representation of data using an unsupervised deep learning method.
Inspired by U-net architecture, our algorithm utilizes an autoencoder integrating CNN-RNN layers to learn latent representations.
Our experiments show our approach outperforms both conventional and deep learning-based unsupervised clustering algorithms.
arXiv Detail & Related papers (2023-04-27T21:45:21Z)
- Spatiotemporal k-means [39.98633724527769]
We propose a spatiotemporal clustering method called spatiotemporal k-means (STkM) that is able to analyze multi-scale clusters.
We show how STkM can be extended to more complex machine learning tasks, particularly unsupervised region of interest detection and tracking in videos.
arXiv Detail & Related papers (2022-11-10T04:40:31Z)
- Adaptively-weighted Integral Space for Fast Multiview Clustering [54.177846260063966]
We propose an Adaptively-weighted Integral Space for Fast Multiview Clustering (AIMC) with nearly linear complexity.
Specifically, view generation models are designed to reconstruct the view observations from the latent integral space.
Experiments conducted on several real-world datasets confirm the superiority of the proposed AIMC method.
arXiv Detail & Related papers (2022-08-25T05:47:39Z)
- SOMTimeS: Self Organizing Maps for Time Series Clustering and its Application to Serious Illness Conversations [3.2689702143620147]
We present a new DTW-based clustering method called SOMTimeS (a Self-Organizing Map for TIME Series).
It scales better and runs faster than other DTW-based clustering algorithms, and has similar performance accuracy.
We applied SOMTimeS to natural language conversation data collected as part of a large healthcare cohort study.
arXiv Detail & Related papers (2021-08-26T00:18:25Z)
- Stable and consistent density-based clustering via multiparameter persistence [77.34726150561087]
We consider the degree-Rips construction from topological data analysis.
We analyze its stability to perturbations of the input data using the correspondence-interleaving distance.
We integrate these methods into a pipeline for density-based clustering, which we call Persistable.
arXiv Detail & Related papers (2020-05-18T19:45:04Z)
- Learnable Subspace Clustering [76.2352740039615]
We develop a learnable subspace clustering paradigm to efficiently solve the large-scale subspace clustering problem.
The key idea is to learn a parametric function to partition the high-dimensional subspaces into their underlying low-dimensional subspaces.
To the best of our knowledge, this paper is the first work to efficiently cluster millions of data points among the subspace clustering methods.
arXiv Detail & Related papers (2020-04-09T12:53:28Z)
- ClusterVO: Clustering Moving Instances and Estimating Visual Odometry for Self and Surroundings [54.33327082243022]
ClusterVO is a stereo Visual Odometry which simultaneously clusters and estimates the motion of both ego and surrounding rigid clusters/objects.
Unlike previous solutions relying on batch input or imposing priors on scene structure or dynamic object models, ClusterVO is online, general and thus can be used in various scenarios including indoor scene understanding and autonomous driving.
arXiv Detail & Related papers (2020-03-29T09:06:28Z)
- New advances in enumerative biclustering algorithms with online partitioning [80.22629846165306]
This paper further extends RIn-Close_CVC, a biclustering algorithm capable of performing an efficient, complete, correct and non-redundant enumeration of maximal biclusters with constant values on columns in numerical datasets.
The improved algorithm, called RIn-Close_CVC3, keeps those attractive properties of RIn-Close_CVC and is characterized by a drastic reduction in memory usage and a consistent gain in runtime.
arXiv Detail & Related papers (2020-03-07T14:54:26Z)
- Autoencoder-based time series clustering with energy applications [0.0]
Time series clustering is a challenging task due to the specific nature of the data.
In this paper we investigate the combination of a convolutional autoencoder and a k-medoids algorithm to perform time series clustering.
arXiv Detail & Related papers (2020-02-10T10:04:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.