DiEC: Diffusion Embedded Clustering
- URL: http://arxiv.org/abs/2512.20905v2
- Date: Thu, 25 Dec 2025 16:22:07 GMT
- Title: DiEC: Diffusion Embedded Clustering
- Authors: Haidong Hu,
- Abstract summary: Deep clustering critically depends on representations that expose clear cluster structure. Most prior methods learn a single embedding with an autoencoder or a self-supervised encoder and treat it as the primary representation for clustering. We propose Diffusion Embedded Clustering (DiEC), an unsupervised clustering framework that exploits the representation trajectory of a pretrained diffusion model by directly leveraging intermediate activations of its U-Net.
- Score: 0.76629754443761
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep clustering critically depends on representations that expose clear cluster structure, yet most prior methods learn a single embedding with an autoencoder or a self-supervised encoder and treat it as the primary representation for clustering. In contrast, a pretrained diffusion model induces a rich representation trajectory over network layers and noise timesteps, along which clusterability varies substantially. We propose Diffusion Embedded Clustering (DiEC), an unsupervised clustering framework that exploits this trajectory by directly leveraging intermediate activations of a pretrained diffusion U-Net. DiEC formulates representation selection over layer × timestep and adopts a practical two-stage procedure: it uses the U-Net bottleneck as the Clustering Middle Layer (CML, l*) and identifies the Clustering-Optimal Timestep (COT, t*) via an efficient subset-based, noise-averaged search. Conditioning on (l*, t*), DiEC learns clustering embeddings through a lightweight residual mapping, optimized with a DEC-style KL self-training objective and structural regularization, while a parallel random-timestep denoising-consistency loss stabilizes training and preserves diffusion behavior. Experiments on standard benchmarks demonstrate that DiEC achieves strong clustering performance and reveal the importance of selecting diffusion representations for clustering.
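The abstract mentions a "DEC-style KL self-training objective". As a rough illustration only, the sketch below follows the standard DEC formulation (Student's t soft assignments, a sharpened target distribution, and a KL divergence between the two). It is not DiEC's implementation: the function names, the `alpha` parameter, and the toy data are assumptions made for this example, and the structural regularization and denoising-consistency terms are left out.

```python
import numpy as np


def soft_assign(z, centers, alpha=1.0):
    """Student's t soft assignment q_ij between embeddings and cluster centers."""
    # Squared Euclidean distances, shape (n_samples, n_clusters).
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)


def target_distribution(q):
    """Sharpened targets p_ij: square q, divide by soft cluster frequency, renormalize."""
    w = q ** 2 / q.sum(axis=0, keepdims=True)
    return w / w.sum(axis=1, keepdims=True)


def kl_self_training_loss(p, q, eps=1e-12):
    """Mean KL(P || Q) over samples; P is treated as a fixed target."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))) / q.shape[0])


# Toy data (assumption): two Gaussian blobs standing in for learned embeddings.
rng = np.random.default_rng(0)
z = np.concatenate([
    rng.normal(loc=-2.0, scale=0.5, size=(50, 4)),
    rng.normal(loc=2.0, scale=0.5, size=(50, 4)),
])
centers = np.stack([z[:50].mean(axis=0), z[50:].mean(axis=0)])

q = soft_assign(z, centers)
p = target_distribution(q)
loss = kl_self_training_loss(p, q)
print(f"KL self-training loss: {loss:.6f}")
```

In a training loop, the embeddings and centers would be updated by gradient descent on this loss while `p` is recomputed periodically from the current `q`; an autodiff framework would replace the NumPy arrays in that setting.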
Related papers
- You Can Trust Your Clustering Model: A Parameter-free Self-Boosting Plug-in for Deep Clustering [73.48306836608124]
DCBoost is a parameter-free plug-in designed to enhance the global feature structures of current deep clustering models. By harnessing reliable local structural cues, our method aims to elevate clustering performance effectively.
arXiv Detail & Related papers (2025-11-26T09:16:36Z) - Self-Enhanced Image Clustering with Cross-Modal Semantic Consistency [57.961869351897384]
We propose a framework based on cross-modal semantic consistency for efficient image clustering. Our framework first builds a strong foundation via Cross-Modal Semantic Consistency. In the first stage, we train lightweight clustering heads to align with the rich semantics of the pre-trained model. In the second stage, we introduce a Self-Enhanced fine-tuning strategy.
arXiv Detail & Related papers (2025-08-02T08:12:57Z) - Clustering via Self-Supervised Diffusion [6.9158153233702935]
Clustering via Diffusion (CLUDI) is a self-supervised framework that combines the generative power of diffusion models with pre-trained Vision Transformer features to achieve robust and accurate clustering. CLUDI is trained via a teacher-student paradigm: the teacher uses diffusion-based sampling to produce diverse cluster assignments, which the student refines into stable predictions.
arXiv Detail & Related papers (2025-07-06T07:57:08Z) - Scalable Context-Preserving Model-Aware Deep Clustering for Hyperspectral Images [51.95768218975529]
Subspace clustering has become widely adopted for the unsupervised analysis of hyperspectral images (HSIs). Recent model-aware deep subspace clustering methods often use a two-stage framework, involving the calculation of a self-representation matrix with O(n^2) complexity, followed by spectral clustering. We propose a scalable, context-preserving deep clustering method based on basis representation, which jointly captures local and non-local structures for efficient HSI clustering.
arXiv Detail & Related papers (2025-06-12T16:43:09Z) - Fuzzy Cluster-Aware Contrastive Clustering for Time Series [1.435214708535728]
Traditional unsupervised clustering methods often fail to capture the complex nature of time series data. We propose a fuzzy cluster-aware contrastive clustering framework (FCACC) that jointly optimizes representation learning and clustering. Our approach introduces a novel three-view data augmentation strategy to enhance feature extraction by leveraging various characteristics of time series data.
arXiv Detail & Related papers (2025-03-28T07:59:23Z) - Towards Learnable Anchor for Deep Multi-View Clustering [49.767879678193005]
In this paper, we propose the Deep Multi-view Anchor Clustering (DMAC) model that performs clustering in linear time. With the optimal anchors, the full sample graph is calculated to derive a discriminative embedding for clustering. Experiments on several datasets demonstrate superior performance and efficiency of DMAC compared to state-of-the-art competitors.
arXiv Detail & Related papers (2025-03-16T09:38:11Z) - Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z) - GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method to incorporate feature learning and augmentation into the clustering procedure.
First, we develop a discriminative feature alignment mechanism to discover intrinsic relationships across real and generated samples.
Second, we design a self-supervised metric learning scheme to generate more reliable cluster assignments.
arXiv Detail & Related papers (2024-04-14T01:51:11Z) - Deep Embedding Clustering Driven by Sample Stability [16.53706617383543]
We propose a deep embedding clustering algorithm driven by sample stability (DECS).
Specifically, we start by constructing the initial feature space with an autoencoder and then learn the cluster-oriented embedding feature constrained by sample stability.
The experimental results on five datasets illustrate that the proposed method achieves superior performance compared to state-of-the-art clustering approaches.
arXiv Detail & Related papers (2024-01-29T09:19:49Z) - End-to-end Learnable Clustering for Intent Learning in Recommendation [54.157784572994316]
We propose a novel intent learning method termed ELCRec.
It unifies behavior representation learning into an End-to-end Learnable Clustering framework.
We deploy this method on the industrial recommendation system with 130 million page views and achieve promising results.
arXiv Detail & Related papers (2024-01-11T15:22:55Z) - Deep Temporal Contrastive Clustering [21.660509622172274]
This paper presents a deep temporal contrastive clustering approach.
It incorporates the contrastive learning paradigm into deep time series clustering research.
Experiments on a variety of time series datasets demonstrate the superiority of our approach over the state-of-the-art.
arXiv Detail & Related papers (2022-12-29T16:43:34Z) - Very Compact Clusters with Structural Regularization via Similarity and Connectivity [3.779514860341336]
We propose an end-to-end deep clustering algorithm, Very Compact Clusters (VCC), for general datasets.
Our proposed approach achieves better clustering performance over most of the state-of-the-art clustering methods.
arXiv Detail & Related papers (2021-06-09T23:22:03Z) - Hierarchical Clustering using Auto-encoded Compact Representation for Time-series Analysis [8.660029077292346]
We propose a novel mechanism to identify clusters by combining a learned compact representation of time series, the Auto Encoded Compact Sequence (AECS), with a hierarchical clustering approach.
Our algorithm exploits a Recurrent Neural Network (RNN) based undercomplete sequence-to-sequence (seq2seq) autoencoder and agglomerative hierarchical clustering.
arXiv Detail & Related papers (2021-01-11T08:03:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.