ACTGNN: Assessment of Clustering Tendency with Synthetically-Trained Graph Neural Networks
- URL: http://arxiv.org/abs/2501.18112v1
- Date: Thu, 30 Jan 2025 03:31:26 GMT
- Title: ACTGNN: Assessment of Clustering Tendency with Synthetically-Trained Graph Neural Networks
- Authors: Yiran Luo, Evangelos E. Papalexakis
- Abstract summary: ACTGNN is a graph-based framework designed to assess clustering tendency by leveraging graph representations of data.
A Graph Neural Network (GNN) is trained exclusively on synthetic datasets, enabling robust learning of clustering structures.
Our results highlight the generalizability and effectiveness of the proposed approach, making it a promising tool for robust clustering tendency assessment.
- Score: 4.668678950572517
- License:
- Abstract: Determining clustering tendency in datasets is a fundamental but challenging task, especially in noisy or high-dimensional settings where traditional methods, such as the Hopkins Statistic and Visual Assessment of Tendency (VAT), often struggle to produce reliable results. In this paper, we propose ACTGNN, a graph-based framework designed to assess clustering tendency by leveraging graph representations of data. Node features are constructed using Locality-Sensitive Hashing (LSH), which captures local neighborhood information, while edge features incorporate multiple similarity metrics, such as the Radial Basis Function (RBF) kernel, to model pairwise relationships. A Graph Neural Network (GNN) is trained exclusively on synthetic datasets, enabling robust learning of clustering structures under controlled conditions. Extensive experiments demonstrate that ACTGNN significantly outperforms baseline methods on both synthetic and real-world datasets, exhibiting superior performance in detecting faint clustering structures, even in high-dimensional or noisy data. Our results highlight the generalizability and effectiveness of the proposed approach, making it a promising tool for robust clustering tendency assessment.
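As a rough illustration of the pipeline described in the abstract, the sketch below builds a graph from raw data using random-hyperplane LSH signatures as node features and an RBF-kernel k-nearest-neighbor adjacency as edge weights, then applies one GCN-style propagation step. This is a minimal sketch, not the authors' implementation: the hyperparameters (`n_planes`, `sigma`, `k`), the plain-NumPy GCN layer, the random weight matrix, and the toy data are all illustrative assumptions; the actual ACTGNN trains its GNN on labeled synthetic datasets to predict clusterability.

```python
# Illustrative sketch only (assumed hyperparameters, not the ACTGNN codebase).
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)

def lsh_node_features(X, n_planes=16):
    """Random-hyperplane LSH: each node's feature is a binary signature of its locality."""
    planes = rng.standard_normal((X.shape[1], n_planes))
    return (X @ planes > 0).astype(float)            # (n_samples, n_planes)

def rbf_edges(X, k=10, sigma=1.0):
    """Dense RBF similarities, sparsified to a symmetric k-nearest-neighbor adjacency."""
    S = rbf_kernel(X, gamma=1.0 / (2 * sigma ** 2))
    np.fill_diagonal(S, 0.0)
    A = np.zeros_like(S)
    idx = np.argsort(-S, axis=1)[:, :k]              # top-k neighbors per node
    rows = np.repeat(np.arange(len(S)), k)
    A[rows, idx.ravel()] = S[rows, idx.ravel()]
    return np.maximum(A, A.T)                        # symmetrize

def gcn_layer(A, H, W):
    """One GCN-style propagation: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(len(A))
    d = A_hat.sum(axis=1)
    A_norm = A_hat / np.sqrt(np.outer(d, d))
    return np.maximum(A_norm @ H @ W, 0.0)

# Toy usage: two well-separated Gaussian blobs, i.e. data with clustering tendency.
X = np.vstack([rng.normal(0.0, 1.0, (50, 5)), rng.normal(6.0, 1.0, (50, 5))])
H = lsh_node_features(X)                             # node features
A = rbf_edges(X)                                     # weighted adjacency
W = rng.standard_normal((H.shape[1], 8)) * 0.1       # untrained layer weights
Z = gcn_layer(A, H, W)                               # embeddings a trained GNN head would pool
print(Z.shape)                                       # (100, 8)
```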
Related papers
- Graph as a feature: improving node classification with non-neural graph-aware logistic regression [2.952177779219163]
Graph-aware Logistic Regression (GLR) is a non-neural model designed for node classification tasks.
Unlike traditional graph algorithms that use only a fraction of the information accessible to GNNs, our proposed model simultaneously leverages both node features and the relationships between entities.
arXiv Detail & Related papers (2024-11-19T08:32:14Z) - Adaptive Self-supervised Robust Clustering for Unstructured Data with Unknown Cluster Number [12.926206811876174]
We introduce a novel self-supervised deep clustering approach tailored for unstructured data, termed Adaptive Self-supervised Robust Clustering (ASRC).
ASRC adaptively learns the graph structure and edge weights to capture both local and global structural information.
ASRC even outperforms methods that rely on prior knowledge of the number of clusters, highlighting its effectiveness in addressing the challenges of clustering unstructured data.
arXiv Detail & Related papers (2024-07-29T15:51:09Z) - Rethinking the Effectiveness of Graph Classification Datasets in Benchmarks for Assessing GNNs [7.407592553310068]
We propose an empirical protocol based on a fair benchmarking framework to investigate the performance discrepancy between simple methods and GNNs.
We also propose a novel metric to quantify the dataset effectiveness by considering both dataset complexity and model performance.
Our findings shed light on the current understanding of benchmark datasets, and our new platform could fuel the future evolution of graph classification benchmarks.
arXiv Detail & Related papers (2024-07-06T08:33:23Z) - Cluster-based Graph Collaborative Filtering [55.929052969825825]
Graph Convolution Networks (GCNs) have succeeded in learning user and item representations for recommendation systems.
Most existing GCN-based methods overlook the multiple interests of users while performing high-order graph convolution.
We propose a novel GCN-based recommendation model, termed Cluster-based Graph Collaborative Filtering (ClusterGCF).
arXiv Detail & Related papers (2024-04-16T07:05:16Z) - Redundancy-Free Self-Supervised Relational Learning for Graph Clustering [13.176413653235311]
We propose a novel self-supervised deep graph clustering method named Redundancy-Free Graph Clustering (R$^2$FGC).
It extracts the attribute- and structure-level relational information from both global and local views based on an autoencoder and a graph autoencoder.
Our experiments are performed on widely used benchmark datasets to validate the superiority of our R$^2$FGC over state-of-the-art baselines.
arXiv Detail & Related papers (2023-09-09T06:18:50Z) - Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe.
GNNSafe achieves up to 17.0% AUROC improvement over state-of-the-art methods and could serve as a simple yet strong baseline in this under-developed area.
arXiv Detail & Related papers (2023-02-06T16:38:43Z) - Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset and the FashionMNIST vs MNIST dataset, among others.
arXiv Detail & Related papers (2022-06-26T16:00:22Z) - Interpolation-based Correlation Reduction Network for Semi-Supervised Graph Learning [49.94816548023729]
We propose a novel graph contrastive learning method, termed Interpolation-based Correlation Reduction Network (ICRN).
In our method, we improve the discriminative capability of the latent feature by enlarging the margin of decision boundaries.
By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z) - Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z) - Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimizes this metric.
arXiv Detail & Related papers (2022-04-13T15:54:35Z) - Graph Clustering with Graph Neural Networks [5.305362965553278]
Graph Neural Networks (GNNs) have achieved state-of-the-art results on many graph analysis tasks.
Unsupervised problems on graphs, such as graph clustering, have proved more resistant to advances in GNNs.
We introduce Deep Modularity Networks (DMoN), an unsupervised pooling method inspired by the modularity measure of clustering quality (see the sketch after this list).
arXiv Detail & Related papers (2020-06-30T15:30:49Z)
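For reference, the modularity measure that DMoN builds on can be computed directly. The snippet below is a minimal NumPy sketch of Newman's modularity on an assumed toy graph; it is not DMoN's pooling layer or training objective.

```python
# Minimal sketch (assumed toy example, not DMoN's implementation): Newman's modularity,
# the clustering-quality measure that DMoN's pooling objective is built around.
import numpy as np

def modularity(A, labels):
    """Q = (1/2m) * sum_ij [A_ij - k_i*k_j/(2m)] * delta(c_i, c_j)."""
    m = A.sum() / 2.0                       # total edge weight
    k = A.sum(axis=1)                       # (weighted) node degrees
    expected = np.outer(k, k) / (2.0 * m)   # null-model edge weights
    same = labels[:, None] == labels[None, :]
    return float(((A - expected) * same).sum() / (2.0 * m))

# Toy usage: two triangles joined by a single bridge edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
print(modularity(A, np.array([0, 0, 0, 1, 1, 1])))  # ~0.36, good partition
print(modularity(A, np.array([0, 1, 0, 1, 0, 1])))  # negative, poor partition
```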
This list is automatically generated from the titles and abstracts of the papers on this site.