Related papers: Incremental Gaussian Mixture Clustering for Data Streams

Related papers

Robust Categorical Data Clustering Guided by Multi-Granular Competitive Learning [47.32771052588132]
The nested granular cluster effect is prevalent in the implicit discrete distance space of categorical data.<n>We propose a Multi-Granular Competitiveization Learning algorithm to allow potential clusters to interactively tune themselves.<n>It is shown that the proposed MGCPL-guided Categorical Data Clustering approach is competent in exploring the nested distribution of multi-granular clusters.
arXiv Detail & Related papers (2026-01-23T06:33:08Z)
TNStream: Applying Tightest Neighbors to Micro-Clusters to Define Multi-Density Clusters in Streaming Data [1.2016321065590192]
This paper proposes a clustering algorithm based on the novel concept of Tightest Neighbors and introduces a data stream clustering theory based on the Skeleton Set. Based on these theories, this paper develops a new method, TNStream, a fully online algorithm. Experimental results demonstrate its effectiveness in improving clustering quality for multi-density data and validate the proposed data stream clustering theory.
arXiv Detail & Related papers (2025-05-01T07:15:20Z)
Village-Net Clustering: A Rapid approach to Non-linear Unsupervised Clustering of High-Dimensional Data [0.0]
We develop an unsupervised clustering algorithm, we call "Village-Net" The algorithm operates in two phases: first, utilizing K-Means clustering, it divides the dataset into distinct subsets we refer to as "villages" We present extensive benchmarking on extant real-world datasets with known ground-truth labels to showcase its competitive performance.
arXiv Detail & Related papers (2025-01-16T06:56:43Z)
Hierarchical Sparse Representation Clustering for High-Dimensional Data Streams [16.228652652243888]
We propose a hierarchical sparse representation clustering (HSRC) method for clustering high-dimensional data streams. The experimental results obtained on several benchmark datasets demonstrate the effectiveness and robustness of HSRC.
arXiv Detail & Related papers (2024-09-07T03:40:55Z)
Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets. In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem. This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering. In our proposed method, cluster number determination and unsupervised representation learning are unified into a uniform framework. In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
Spatio-Temporal Surrogates for Interaction of a Jet with High Explosives: Part II -- Clustering Extremely High-Dimensional Grid-Based Data [0.0]
In this report, we consider output data from simulations of a jet interacting with high explosives. We show how we can use the randomness of both the random projections, and the choice of initial centroids in k-means clustering, to determine the number of clusters in our data set.
arXiv Detail & Related papers (2023-07-03T23:36:43Z)
Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation. Specifically, we construct distance matrix between data points by Butterworth filter. To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed. While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster. We propose a method that does not require data augmentation, and that, differently from existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z)
Research on Efficient Fuzzy Clustering Method Based on Local Fuzzy Granular balls [67.33923111887933]
In this paper, the data is fuzzy iterated using granular-balls, and the membership degree of data only considers the two granular-balls where it is located. The formed fuzzy granular-balls set can use more processing methods in the face of different data scenarios.
arXiv Detail & Related papers (2023-03-07T01:52:55Z)
Inv-SENnet: Invariant Self Expression Network for clustering under biased data [17.25929452126843]
We propose a novel framework for jointly removing unwanted attributes (biases) while learning to cluster data points in individual subspaces. Our experimental result on synthetic and real-world datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-11-13T01:19:06Z)
Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types [60.45942774425782]
We introduce anomaly clustering, whose goal is to group data into coherent clusters of anomaly types. This is different from anomaly detection, whose goal is to divide anomalies from normal data. We present a simple yet effective clustering framework using a patch-based pretrained deep embeddings and off-the-shelf clustering methods.
arXiv Detail & Related papers (2021-12-21T23:11:33Z)
Clustering Plotted Data by Image Segmentation [12.443102864446223]
Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data. In this paper, we present a wholly different way of clustering points in 2-dimensional space, inspired by how humans cluster data. Our approach, Visual Clustering, has several advantages over traditional clustering algorithms.
arXiv Detail & Related papers (2021-10-06T06:19:30Z)
Data Stream Clustering: A Review [0.0]
Clustering is one of the most suitable methods for real-time data stream processing. We review recent data stream clustering algorithms and analyze them in terms of the base clustering technique, computational complexity and clustering accuracy. We indicate popular data stream repositories and datasets, stream processing tools and platforms.
arXiv Detail & Related papers (2020-07-16T20:35:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.