Measures of Overlapping Multivariate Gaussian Clusters in Unsupervised Online Learning
- URL: http://arxiv.org/abs/2508.15444v1
- Date: Thu, 21 Aug 2025 11:06:02 GMT
- Title: Measures of Overlapping Multivariate Gaussian Clusters in Unsupervised Online Learning
- Authors: Miha Ožbot, Igor Škrjanc
- Abstract summary: The aim of online learning from data streams is to create clustering, classification, or regression models that can adapt over time. In the case of clustering, this can result in a large number of clusters that may overlap and should be merged. Our proposed dissimilarity measure is specifically designed to detect overlap rather than dissimilarity.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a new measure for detecting overlap in multivariate Gaussian clusters. The aim of online learning from data streams is to create clustering, classification, or regression models that can adapt over time based on the conceptual drift of streaming data. In the case of clustering, this can result in a large number of clusters that may overlap and should be merged. Commonly used distribution dissimilarity measures are not adequate for determining overlapping clusters in the context of online learning from streaming data due to their inability to account for all shapes of clusters and their high computational demands. Our proposed dissimilarity measure is specifically designed to detect overlap rather than dissimilarity and can be computed faster compared to existing measures. Our method is several times faster than compared methods and is capable of detecting overlapping clusters while avoiding the merging of orthogonal clusters.
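For context on what a "commonly used distribution dissimilarity measure" looks like, the following is a minimal sketch of the Bhattacharyya distance between two 2-D Gaussian clusters. It is a standard textbook measure, not the measure proposed in the paper, and the abstract does not say whether it is one of the compared methods. The function name `bhattacharyya_2d` and the example clusters are assumptions made for illustration only.

```python
import math


def bhattacharyya_2d(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two 2-D Gaussians.

    Illustrative helper (hypothetical name); means are length-2 sequences
    and covariances are 2x2 nested lists. Not the paper's proposed measure.
    """
    # Averaged covariance S = (S1 + S2) / 2
    s = [[(cov1[i][j] + cov2[i][j]) / 2.0 for j in range(2)] for i in range(2)]

    def det(m):
        return m[0][0] * m[1][1] - m[0][1] * m[1][0]

    det_s, det_1, det_2 = det(s), det(cov1), det(cov2)
    if min(det_s, det_1, det_2) <= 0.0:
        raise ValueError("covariance matrices must be positive definite")

    dx, dy = mu1[0] - mu2[0], mu1[1] - mu2[1]
    # Mahalanobis-type term d^T S^{-1} d via the closed-form 2x2 inverse
    maha = (s[1][1] * dx * dx
            - (s[0][1] + s[1][0]) * dx * dy
            + s[0][0] * dy * dy) / det_s
    # Shape term: penalizes differing covariance shapes even for equal means
    shape = 0.5 * math.log(det_s / math.sqrt(det_1 * det_2))
    return maha / 8.0 + shape


identity = [[1.0, 0.0], [0.0, 1.0]]
# Identical clusters: distance is zero (full overlap)
d_same = bhattacharyya_2d([0.0, 0.0], identity, [0.0, 0.0], identity)
# Same shape, means 6 units apart: only the mean term contributes
d_far = bhattacharyya_2d([0.0, 0.0], identity, [6.0, 0.0], identity)
# Same mean, elongated along orthogonal axes: only the shape term contributes
d_orth = bhattacharyya_2d([0.0, 0.0], [[4.0, 0.0], [0.0, 0.04]],
                          [0.0, 0.0], [[0.04, 0.0], [0.0, 4.0]])
```

The third case shows why cluster shape matters for merging decisions: two elongated clusters that share a mean but point in orthogonal directions still receive a nonzero distance, and a merge threshold has to be chosen with such cases in mind. Each evaluation also needs determinants and an inverse of the averaged covariance, which is a source of computational cost for measures of this kind in higher dimensions.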
Related papers
- Towards Learnable Anchor for Deep Multi-View Clustering [49.767879678193005]
In this paper, we propose the Deep Multi-view Anchor Clustering (DMAC) model that performs clustering in linear time.
With the optimal anchors, the full sample graph is calculated to derive a discriminative embedding for clustering.
Experiments on several datasets demonstrate superior performance and efficiency of DMAC compared to state-of-the-art competitors.
arXiv Detail & Related papers (2025-03-16T09:38:11Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a uniform framework.
In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
- Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation and that, unlike existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z)
- Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types [60.45942774425782]
We introduce anomaly clustering, whose goal is to group data into coherent clusters of anomaly types.
This is different from anomaly detection, whose goal is to divide anomalies from normal data.
We present a simple yet effective clustering framework using patch-based pretrained deep embeddings and off-the-shelf clustering methods.
arXiv Detail & Related papers (2021-12-21T23:11:33Z)
- Learning Statistical Representation with Joint Deep Embedded Clustering [2.1267423178232407]
StatDEC is an unsupervised framework for joint statistical representation learning and clustering.
Our experiments show that using these representations, one can considerably improve results on imbalanced image clustering across a variety of image datasets.
arXiv Detail & Related papers (2021-09-11T09:26:52Z)
- Efficient Large-Scale Face Clustering Using an Online Mixture of Gaussians [1.3101369903953806]
We present an online Gaussian mixture-based clustering method (OGMC) for large-scale online face clustering.
Using feature vectors (f-vectors) extracted from the incoming faces, OGMC generates clusters that may be connected to others depending on their proximity and robustness.
Experimental results show that the proposed approach outperforms state-of-the-art clustering methods on large-scale face clustering benchmarks.
arXiv Detail & Related papers (2021-03-31T17:59:38Z)
- Dynamic Clustering in Federated Learning [15.37652170495055]
We propose a three-phased data clustering algorithm, namely: generative adversarial network-based clustering, cluster calibration, and cluster division.
Our algorithm improves the performance of forecasting models, including cellular network handover, by 43%.
arXiv Detail & Related papers (2020-12-07T15:30:07Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Progressive Cluster Purification for Unsupervised Feature Learning [48.87365358296371]
In unsupervised feature learning, sample-specificity-based methods ignore inter-class information.
We propose a novel clustering-based method that excludes class-inconsistent samples during progressive cluster formation.
Our approach, referred to as Progressive Cluster Purification (PCP), implements progressive clustering by gradually reducing the number of clusters during training.
arXiv Detail & Related papers (2020-07-06T08:11:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.