Clustering via Self-Supervised Diffusion
- URL: http://arxiv.org/abs/2507.04283v2
- Date: Tue, 29 Jul 2025 19:29:39 GMT
- Title: Clustering via Self-Supervised Diffusion
- Authors: Roy Uziel, Irit Chelly, Oren Freifeld, Ari Pakman,
- Abstract summary: Clustering via Diffusion (CLUDI) is a self-supervised framework that combines the generative power of diffusion models with pre-trained Vision Transformer features to achieve robust and accurate clustering.<n>CLUDI is trained via a teacher-student paradigm: the teacher uses diffusion-based sampling to produce diverse cluster assignments, which the student refines into stable predictions.
- Score: 6.9158153233702935
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models, widely recognized for their success in generative tasks, have not yet been applied to clustering. We introduce Clustering via Diffusion (CLUDI), a self-supervised framework that combines the generative power of diffusion models with pre-trained Vision Transformer features to achieve robust and accurate clustering. CLUDI is trained via a teacher-student paradigm: the teacher uses stochastic diffusion-based sampling to produce diverse cluster assignments, which the student refines into stable predictions. This stochasticity acts as a novel data augmentation strategy, enabling CLUDI to uncover intricate structures in high-dimensional data. Extensive evaluations on challenging datasets demonstrate that CLUDI achieves state-of-the-art performance in unsupervised classification, setting new benchmarks in clustering robustness and adaptability to complex data distributions. Our code is available at https://github.com/BGU-CS-VIL/CLUDI.
Related papers
- Fuzzy Cluster-Aware Contrastive Clustering for Time Series [1.435214708535728]
Traditional unsupervised clustering methods often fail to capture the complex nature of time series data.<n>We propose a fuzzy cluster-aware contrastive clustering framework (FCACC) that jointly optimize representation learning and clustering.<n>Our approach introduces a novel three-view data augmentation strategy to enhance feature extraction by leveraging various characteristics of time series data.
arXiv Detail & Related papers (2025-03-28T07:59:23Z) - Towards Learnable Anchor for Deep Multi-View Clustering [49.767879678193005]
In this paper, we propose the Deep Multi-view Anchor Clustering (DMAC) model that performs clustering in linear time.<n>With the optimal anchors, the full sample graph is calculated to derive a discriminative embedding for clustering.<n>Experiments on several datasets demonstrate superior performance and efficiency of DMAC compared to state-of-the-art competitors.
arXiv Detail & Related papers (2025-03-16T09:38:11Z) - Interaction-Aware Gaussian Weighting for Clustered Federated Learning [58.92159838586751]
Federated Learning (FL) emerged as a decentralized paradigm to train models while preserving privacy.<n>We propose a novel clustered FL method, FedGWC (Federated Gaussian Weighting Clustering), which groups clients based on their data distribution.<n>Our experiments on benchmark datasets show that FedGWC outperforms existing FL algorithms in cluster quality and classification accuracy.
arXiv Detail & Related papers (2025-02-05T16:33:36Z) - Hierarchical Clustering for Conditional Diffusion in Image Generation [12.618079575423868]
This paper introduces TreeDiffusion, a deep generative model that conditions Diffusion Models on hierarchical clusters to obtain high-quality, cluster-specific generations.
The proposed pipeline consists of two steps: a VAE-based clustering model that learns the hierarchical structure of the data, and a conditional diffusion model that generates realistic images for each cluster.
arXiv Detail & Related papers (2024-10-22T11:35:36Z) - Self-Supervised Graph Embedding Clustering [70.36328717683297]
K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z) - Adaptive Self-supervised Robust Clustering for Unstructured Data with Unknown Cluster Number [12.926206811876174]
We introduce a novel self-supervised deep clustering approach tailored for unstructured data, termed Adaptive Self-supervised Robust Clustering (ASRC)
ASRC adaptively learns the graph structure and edge weights to capture both local and global structural information.
ASRC even outperforms methods that rely on prior knowledge of the number of clusters, highlighting its effectiveness in addressing the challenges of clustering unstructured data.
arXiv Detail & Related papers (2024-07-29T15:51:09Z) - GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Clustering (GCC) method to incorporate feature learning and augmentation into clustering procedure.
First, we develop a discrimirative feature alignment mechanism to discover intrinsic relationship across real and generated samples.
Second, we design a self-supervised metric learning to generate more reliable cluster assignment.
arXiv Detail & Related papers (2024-04-14T01:51:11Z) - Transferable Deep Clustering Model [14.073783373395196]
We propose a novel transferable deep clustering model that can automatically adapt the cluster centroids according to the distribution of data samples.
Our approach introduces a novel attention-based module that can adapt the centroids by measuring their relationship with samples.
Experimental results on both synthetic and real-world datasets demonstrate the effectiveness and efficiency of our proposed transfer learning framework.
arXiv Detail & Related papers (2023-10-07T23:35:17Z) - Consistency Regularization for Generalizable Source-free Domain
Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods ONLY assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z) - Deep Attention-guided Graph Clustering with Dual Self-supervision [49.040136530379094]
We propose a novel method, namely deep attention-guided graph clustering with dual self-supervision (DAGC)
We develop a dual self-supervision solution consisting of a soft self-supervision strategy with a triplet Kullback-Leibler divergence loss and a hard self-supervision strategy with a pseudo supervision loss.
Our method consistently outperforms state-of-the-art methods on six benchmark datasets.
arXiv Detail & Related papers (2021-11-10T06:53:03Z) - You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data subjected to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z) - Deep adaptive fuzzy clustering for evolutionary unsupervised
representation learning [2.8028128734158164]
Cluster assignment of large and complex images is a crucial but challenging task in pattern recognition and computer vision.
We present a novel evolutionary unsupervised learning representation model with iterative optimization.
We jointly fuzzy clustering to the deep reconstruction model, in which fuzzy membership is utilized to represent a clear structure of deep cluster assignments.
arXiv Detail & Related papers (2021-03-31T13:58:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.