Consistent Amortized Clustering via Generative Flow Networks
- URL: http://arxiv.org/abs/2502.19337v1
- Date: Wed, 26 Feb 2025 17:30:52 GMT
- Title: Consistent Amortized Clustering via Generative Flow Networks
- Authors: Irit Chelly, Roy Uziel, Oren Freifeld, Ari Pakman
- Abstract summary: We introduce GFNCP, a novel framework for amortized clustering. GFNCP is formulated as a Generative Flow Network with a shared energy-based parametrization of policy and reward. We show that the flow matching conditions are equivalent to consistency of the clustering posterior under marginalization.
- Score: 6.9158153233702935
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural models for amortized probabilistic clustering yield samples of cluster labels given a set-structured input, while avoiding lengthy Markov chain runs and the need for explicit data likelihoods. Existing methods that label each data point sequentially, like the Neural Clustering Process, often lead to cluster assignments that are highly dependent on the data order. Alternatively, methods that sequentially create full clusters do not provide assignment probabilities. In this paper, we introduce GFNCP, a novel framework for amortized clustering. GFNCP is formulated as a Generative Flow Network with a shared energy-based parametrization of policy and reward. We show that the flow matching conditions are equivalent to consistency of the clustering posterior under marginalization, which in turn implies order invariance. GFNCP also outperforms existing methods in clustering performance on both synthetic and real-world data.
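Below is a minimal, hypothetical sketch of the sequential-assignment idea in PyTorch; the class and function names are illustrative, not the authors' implementation. The policy scores each candidate assignment of the next point by the energy of the resulting partial clustering, so policy and reward share one energy network. The flow-matching training loss that enforces posterior consistency under marginalization is omitted.

```python
# Hypothetical sketch of GFlowNet-style amortized clustering (not the paper's code).
# A state is a partial labeling of the first t points; the policy assigns point t
# to an existing cluster or to a new one.
import torch
import torch.nn as nn

class EnergyPolicy(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())  # per-point encoder
        self.E = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def state_energy(self, x, labels):
        # Energy of a partial clustering: sum over clusters of E(mean-pooled encoding).
        total = x.new_zeros(())
        for k in set(labels):
            idx = [i for i, l in enumerate(labels) if l == k]
            total = total + self.E(self.phi(x[idx]).mean(0)).squeeze()
        return total

    def action_logits(self, x, labels, t):
        # Point t may join any existing cluster or open a new one; the policy is
        # proportional to exp(-E(child state)), sharing parameters with the reward.
        K = (max(labels) + 1) if labels else 0
        return torch.stack([-self.state_energy(x[: t + 1], labels + [k])
                            for k in range(K + 1)])

model = EnergyPolicy(dim=2)
x = torch.randn(5, 2)
labels = []
for t in range(len(x)):                       # sample one clustering trajectory
    probs = torch.softmax(model.action_logits(x, labels, t), dim=0)
    labels.append(int(torch.multinomial(probs, 1)))
print(labels)                                  # e.g. [0, 0, 1, 0, 2]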
Related papers
- TNStream: Applying Tightest Neighbors to Micro-Clusters to Define Multi-Density Clusters in Streaming Data [1.2016321065590192]
This paper proposes a clustering algorithm based on the novel concept of Tightest Neighbors and introduces a data stream clustering theory based on the Skeleton Set.
Based on these theories, the paper develops TNStream, a fully online clustering algorithm.
Experimental results demonstrate its effectiveness in improving clustering quality for multi-density data and validate the proposed data stream clustering theory.
arXiv Detail & Related papers (2025-05-01T07:15:20Z)
- Graph Probability Aggregation Clustering [5.377020739388736]
We propose a graph-based fuzzy clustering algorithm that unifies the global clustering objective function with a local clustering constraint.
The entire GPAC framework is formulated as a multi-constrained optimization problem, which can be solved using the Lagrangian method.
Experiments conducted on synthetic, real-world, and deep learning datasets demonstrate that GPAC not only exceeds existing state-of-the-art methods in clustering performance but also excels in computational efficiency.
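As a rough illustration only (the paper's Lagrangian updates are not reproduced here), a graph-based fuzzy clustering step can blend each point's soft membership with the affinity-weighted memberships of its neighbors:

```python
# Hedged sketch of graph-based fuzzy clustering in the spirit of GPAC;
# the update rule below is a generic assumption, not the paper's derivation.
import numpy as np

def gpac_like(W, K, iters=50, alpha=0.5, seed=0):
    """W: symmetric (n, n) affinity matrix; returns soft memberships P (n, K)."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    P = rng.random((n, K))
    P /= P.sum(1, keepdims=True)             # rows start on the simplex
    D = W.sum(1, keepdims=True)
    for _ in range(iters):
        # Local constraint: pull each point's membership toward the
        # affinity-weighted average of its neighbors' memberships.
        P = (1 - alpha) * P + alpha * (W @ P) / np.maximum(D, 1e-12)
        P = np.maximum(P, 0)
        P /= P.sum(1, keepdims=True)         # re-project onto the simplex
    return P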
arXiv Detail & Related papers (2025-02-27T09:11:32Z)
- Interaction-Aware Gaussian Weighting for Clustered Federated Learning [58.92159838586751]
Federated Learning (FL) has emerged as a decentralized paradigm for training models while preserving privacy.
We propose a novel clustered FL method, FedGWC (Federated Gaussian Weighting Clustering), which groups clients based on their data distribution.
Our experiments on benchmark datasets show that FedGWC outperforms existing FL algorithms in cluster quality and classification accuracy.
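A hedged sketch of the general recipe, with assumed details: score client similarity with a Gaussian kernel on flattened model updates, then group clients by that affinity. FedGWC's interaction-aware weighting itself is not reproduced.

```python
# Illustrative client grouping for clustered FL (assumed simplification).
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_clients(updates, sigma=1.0, threshold=0.5):
    """updates: (n_clients, d) array of flattened local model updates."""
    d2 = ((updates[:, None, :] - updates[None, :, :]) ** 2).sum(-1)
    S = np.exp(-d2 / (2 * sigma ** 2))       # Gaussian affinity between clients
    # Condensed distances (1 - affinity) feed average-linkage clustering.
    Z = linkage(1 - S[np.triu_indices_from(S, k=1)], method="average")
    return fcluster(Z, t=threshold, criterion="distance")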
arXiv Detail & Related papers (2025-02-05T16:33:36Z)
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality-reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in a self-supervised graph embedding framework.
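A generic sketch of the underlying recipe, spectral embedding of a kNN graph followed by K-means; the paper's contribution is to couple these into one self-supervised objective, which is not reproduced here.

```python
# Manifold embedding + K-means as two separate steps (illustrative baseline).
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import KMeans
from scipy.sparse.csgraph import laplacian

def graph_embed_kmeans(X, K, n_neighbors=10):
    W = kneighbors_graph(X, n_neighbors, mode="connectivity", include_self=False)
    W = 0.5 * (W + W.T)                       # symmetrize the kNN graph
    L = laplacian(W, normed=True)
    vals, vecs = np.linalg.eigh(L.toarray()) # spectral embedding of the manifold
    emb = vecs[:, 1:K + 1]                   # drop trivial eigenvector, keep K dims
    return KMeans(n_clusters=K, n_init=10).fit_predict(emb)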
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- Deep Embedding Clustering Driven by Sample Stability [16.53706617383543]
We propose a deep embedding clustering algorithm driven by sample stability (DECS).
Specifically, we start by constructing the initial feature space with an autoencoder and then learn cluster-oriented embeddings constrained by sample stability.
The experimental results on five datasets illustrate that the proposed method achieves superior performance compared to state-of-the-art clustering approaches.
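A sketch of the two-stage idea, with an assumed simplification: the sample-stability constraint is replaced by the standard DEC-style soft-assignment sharpening, which this line of work builds on rather than uses verbatim.

```python
# Stage 1: autoencoder builds the initial feature space.
# Stage 2: cluster-oriented refinement of the embedding (DEC-style stand-in).
import torch
import torch.nn as nn

class AE(nn.Module):
    def __init__(self, dim, latent=10):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

def soft_assign(z, centers, alpha=1.0):
    # Student-t similarity between embeddings and cluster centers.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    q = (1 + d2 / alpha) ** (-(alpha + 1) / 2)
    return q / q.sum(1, keepdim=True)

def target_dist(q):
    # Sharpened target distribution; matching q to it refines the embedding.
    p = q ** 2 / q.sum(0)
    return p / p.sum(1, keepdim=True)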
arXiv Detail & Related papers (2024-01-29T09:19:49Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a single framework.
To provide feedback on the agent's actions, a clustering-oriented reward function is proposed that enhances cohesion within clusters and separation between them.
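A hedged sketch of such a reward (the paper's exact form may differ): it rises with the average similarity inside clusters and falls with the average similarity across them.

```python
# Illustrative clustering-oriented reward: cohesion minus separation.
import numpy as np

def clustering_reward(S, labels):
    """S: (n, n) pairwise similarities; labels: (n,) integer cluster ids."""
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    cohesion = S[same & off_diag].mean() if (same & off_diag).any() else 0.0
    separation = S[~same].mean() if (~same).any() else 0.0
    return cohesion - separation
```

An agent exploring candidate cluster numbers would then keep the partition that maximizes this reward.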
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
- Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [69.15976031704687]
We propose IAC (Instance-Adaptive Clustering), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability. IAC maintains an overall computational complexity of $\mathcal{O}(n\,\mathrm{polylog}(n))$, making it scalable and practical for large-scale problems.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- An Improved Algorithm for Clustered Federated Learning [29.166363192740768]
This paper addresses the dichotomy between heterogeneous models and simultaneous training in Federated Learning (FL).
We define a new clustering model for FL based on the (optimal) local models of the users.
SR-FCA uses a robust learning algorithm within each cluster to exploit simultaneous training and to correct clustering errors.
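A rough sketch of the clustered-FL loop, with coordinate-wise trimmed-mean aggregation standing in for the paper's robust learning subroutine (an assumption, not SR-FCA's exact algorithm):

```python
# Alternate robust per-cluster aggregation with reassignment of users.
import numpy as np

def trimmed_mean(M, trim=0.1):
    # Coordinate-wise trimmed mean: robust to a fraction of misclustered users.
    lo = int(len(M) * trim)
    return np.sort(M, axis=0)[lo: len(M) - lo].mean(axis=0)

def sr_fca_like(models, K, rounds=5, seed=0):
    """models: (n_users, d) array of flattened local model parameters."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, K, size=len(models))
    for _ in range(rounds):
        centers = np.stack([
            trimmed_mean(models[labels == k]) if (labels == k).any()
            else models[rng.integers(len(models))]   # reseed an empty cluster
            for k in range(K)])
        d = ((models[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)                    # correct clustering errors
    return labels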
arXiv Detail & Related papers (2022-10-20T19:14:36Z)
- Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
We propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC).
In SCAGC, a self-supervised contrastive loss that leverages inaccurate clustering labels is designed for node representation learning.
For out-of-sample (OOS) nodes, SCAGC can directly calculate their clustering labels.
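As an illustration of the loss idea only (SCAGC's graph augmentations and architecture are omitted), a contrastive objective can treat nodes sharing a possibly noisy pseudo-label as positives:

```python
# Supervised-contrastive-style loss driven by (inaccurate) cluster pseudo-labels.
import torch
import torch.nn.functional as F

def pseudo_label_contrastive(z, labels, tau=0.5):
    """z: (n, d) node embeddings; labels: (n,) noisy cluster pseudo-labels."""
    n = z.size(0)
    z = F.normalize(z, dim=1)
    eye = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = (z @ z.T / tau).masked_fill(eye, float("-inf"))  # exclude self-pairs
    log_p = sim - sim.logsumexp(dim=1, keepdim=True)
    pos = (labels[:, None] == labels[None, :]) & ~eye      # same-label pairs
    denom = pos.sum(1).clamp(min=1)
    return -(log_p.masked_fill(~pos, 0.0)).sum(1).div(denom).mean()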
arXiv Detail & Related papers (2021-10-15T03:25:28Z)
- Robust Hierarchical Clustering for Directed Networks: An Axiomatic Approach [13.406858660972551]
We provide a complete taxonomic characterization of robust hierarchical clustering methods for directed networks.
We introduce three practical properties associated with robustness in hierarchical clustering: linear scale preservation, stability, and excisiveness.
We also address the implementation of our methods and describe an application to real data.
arXiv Detail & Related papers (2021-08-16T17:28:21Z)
- Neural Mixture Models with Expectation-Maximization for End-to-end Deep Clustering [0.8543753708890495]
In this paper, we realize mixture model-based clustering with a neural network.
We train the network end-to-end via batch-wise EM iterations where the forward pass acts as the E-step and the backward pass acts as the M-step.
Our trained networks outperform single-stage deep clustering methods that still depend on k-means.
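A compact sketch of batch-wise neural EM on a toy two-component Gaussian mixture: the responsibilities are computed in the forward pass (E-step) and the gradient step updates the parameters (M-step). Unit-variance components are an assumed simplification.

```python
# Toy neural EM: forward pass = E-step, backward pass = M-step.
import torch

torch.manual_seed(0)
X = torch.cat([torch.randn(100, 2) - 2, torch.randn(100, 2) + 2])
mu = torch.randn(2, 2, requires_grad=True)       # two learnable cluster means
log_pi = torch.zeros(2, requires_grad=True)      # mixture logits
opt = torch.optim.Adam([mu, log_pi], lr=0.1)

for step in range(200):
    d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
    log_lik = torch.log_softmax(log_pi, 0) - 0.5 * d2   # unit-variance Gaussians
    with torch.no_grad():
        r = torch.softmax(log_lik, dim=1)               # E-step: responsibilities
    loss = -(r * log_lik).sum(1).mean()                 # M-step objective (Q function)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(mu.detach())  # the two means typically converge near (-2, -2) and (2, 2)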
arXiv Detail & Related papers (2021-07-06T08:00:58Z)
- Very Compact Clusters with Structural Regularization via Similarity and Connectivity [3.779514860341336]
We propose an end-to-end deep clustering algorithm, Very Compact Clusters (VCC), for general datasets.
Our proposed approach achieves better clustering performance than most state-of-the-art clustering methods.
arXiv Detail & Related papers (2021-06-09T23:22:03Z)
- You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data assigned to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, the resulting model, TCC, is trained end-to-end and requires no alternating steps.
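One plausible realization of such a reparametrization, assumed here rather than taken from the paper, is the Gumbel-softmax relaxation, which keeps cluster assignments differentiable:

```python
# Gumbel-softmax reparametrization of categorical cluster assignments
# (an assumption for illustration; see the paper for the actual scheme).
import torch
import torch.nn.functional as F

def sample_assignment(logits, tau=0.5):
    # Differentiable one-hot-like sample over K clusters (straight-through).
    return F.gumbel_softmax(logits, tau=tau, hard=True)

logits = torch.randn(8, 4, requires_grad=True)   # 8 instances, 4 clusters
a = sample_assignment(logits)                    # (8, 4) one-hot assignments
feats = torch.randn(8, 16)
centroid = a.T @ feats / a.sum(0, keepdim=True).T.clamp(min=1)
# 'centroid' pools the features of instances assigned to each cluster, giving
# the cluster-level representation that a contrastive loss can then operate on.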
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.