Robust Correlation Clustering with Asymmetric Noise
- URL: http://arxiv.org/abs/2110.08385v1
- Date: Fri, 15 Oct 2021 21:47:27 GMT
- Title: Robust Correlation Clustering with Asymmetric Noise
- Authors: Jimit Majmudar, Stephen Vavasis
- Abstract summary: Correlation Clustering is a graph clustering formulation which: (1) takes as input a signed graph with edge weights representing a similarity/dissimilarity measure between the nodes, and (2) requires no prior estimate of the number of clusters in the input graph.
We propose a novel graph generative model, called the Node Factors Model (NFM), which is based on generating feature vectors/embeddings for the graph nodes.
- Score: 3.8073142980733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph clustering problems typically aim to partition the graph nodes such
that two nodes belong to the same partition set if and only if they are
similar. Correlation Clustering is a graph clustering formulation which: (1)
takes as input a signed graph with edge weights representing a
similarity/dissimilarity measure between the nodes, and (2) requires no prior
estimate of the number of clusters in the input graph. However, the
combinatorial optimization problem underlying Correlation Clustering is
NP-hard. In this work, we propose a novel graph generative model, called the
Node Factors Model (NFM), which is based on generating feature
vectors/embeddings for the graph nodes. The graphs generated by the NFM contain
asymmetric noise in the sense that there may exist pairs of nodes in the same
cluster which are negatively correlated. We propose a novel Correlation
Clustering algorithm, called \anormd, using techniques from semidefinite
programming. Using a combination of theoretical and computational results, we
demonstrate that $\texttt{$\ell_2$-norm-diag}$ recovers nodes with sufficiently
strong cluster membership in graph instances generated by the NFM, thereby
making progress towards establishing the provable robustness of our proposed
algorithm.
Related papers
- Multiscale Graph Construction Using Non-local Cluster Features [10.922757310575307]
We consider graph and node-wise features simultaneously for multiscale clustering of a graph.
In experiments on multiscale image and point cloud segmentation, we demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2024-11-13T06:42:03Z) - HeNCler: Node Clustering in Heterophilous Graphs through Learned Asymmetric Similarity [55.27586970082595]
HeNCler is a novel approach for Heterophilous Node Clustering.
We show that HeNCler significantly enhances performance in node clustering tasks within heterophilous graph contexts.
arXiv Detail & Related papers (2024-05-27T11:04:05Z) - Cluster-based Graph Collaborative Filtering [55.929052969825825]
Graph Convolution Networks (GCNs) have succeeded in learning user and item representations for recommendation systems.
Most existing GCN-based methods overlook the multiple interests of users while performing high-order graph convolution.
We propose a novel GCN-based recommendation model, termed Cluster-based Graph Collaborative Filtering (ClusterGCF)
arXiv Detail & Related papers (2024-04-16T07:05:16Z) - MeanCut: A Greedy-Optimized Graph Clustering via Path-based Similarity
and Degree Descent Criterion [0.6906005491572401]
spectral clustering is popular and attractive due to the remarkable performance, easy implementation, and strong adaptability.
We propose MeanCut as the objective function and greedily optimize it in degree descending order for a nondestructive graph partition.
The validity of our algorithm is demonstrated by testifying on real-world benchmarks and application of face recognition.
arXiv Detail & Related papers (2023-12-07T06:19:39Z) - Latent Random Steps as Relaxations of Max-Cut, Min-Cut, and More [30.919536115917726]
We present a probabilistic model based on non-negative matrix factorization which unifies clustering and simplification.
By relaxing the hard clustering to a soft clustering, our algorithm relaxes potentially hard clustering problems to a tractable ones.
arXiv Detail & Related papers (2023-08-12T02:47:57Z) - Graph Mixup with Soft Alignments [49.61520432554505]
We study graph data augmentation by mixup, which has been used successfully on images.
We propose S-Mixup, a simple yet effective mixup method for graph classification by soft alignments.
arXiv Detail & Related papers (2023-06-11T22:04:28Z) - Dink-Net: Neural Clustering on Large Graphs [59.10189693120368]
A deep graph clustering method (Dink-Net) is proposed with the idea of dilation and shrink.
By discriminating nodes, whether being corrupted by augmentations, representations are learned in a self-supervised manner.
The clustering distribution is optimized by minimizing the proposed cluster dilation loss and cluster shrink loss.
Compared to the runner-up, Dink-Net 9.62% achieves NMI improvement on the ogbn-papers100M dataset with 111 million nodes and 1.6 billion edges.
arXiv Detail & Related papers (2023-05-28T15:33:24Z) - AnchorGAE: General Data Clustering via $O(n)$ Bipartite Graph
Convolution [79.44066256794187]
We show how to convert a non-graph dataset into a graph by introducing the generative graph model, which is used to build graph convolution networks (GCNs)
A bipartite graph constructed by anchors is updated dynamically to exploit the high-level information behind data.
We theoretically prove that the simple update will lead to degeneration and a specific strategy is accordingly designed.
arXiv Detail & Related papers (2021-11-12T07:08:13Z) - Explicit Pairwise Factorized Graph Neural Network for Semi-Supervised
Node Classification [59.06717774425588]
We propose the Explicit Pairwise Factorized Graph Neural Network (EPFGNN), which models the whole graph as a partially observed Markov Random Field.
It contains explicit pairwise factors to model output-output relations and uses a GNN backbone to model input-output relations.
We conduct experiments on various datasets, which shows that our model can effectively improve the performance for semi-supervised node classification on graphs.
arXiv Detail & Related papers (2021-07-27T19:47:53Z) - Seeing All From a Few: Nodes Selection Using Graph Pooling for Graph
Clustering [37.68977275752782]
noisy edges and nodes in the graph may make the clustering results worse.
We propose a novel dual graph embedding network(DGEN) to improve the robustness of the graph clustering to the noisy nodes and edges.
Experiments on three benchmark graph datasets demonstrate the superiority compared with several state-of-the-art algorithms.
arXiv Detail & Related papers (2021-04-30T06:51:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.