AnchorGAE: General Data Clustering via $O(n)$ Bipartite Graph
Convolution
- URL: http://arxiv.org/abs/2111.06586v1
- Date: Fri, 12 Nov 2021 07:08:13 GMT
- Title: AnchorGAE: General Data Clustering via $O(n)$ Bipartite Graph
Convolution
- Authors: Hongyuan Zhang, Jiankun Shi, Rui Zhang, Xuelong Li
- Abstract summary: We show how to convert a non-graph dataset into a graph by introducing the generative graph model, which is used to build graph convolution networks (GCNs)
A bipartite graph constructed by anchors is updated dynamically to exploit the high-level information behind data.
We theoretically prove that the simple update will lead to degeneration and a specific strategy is accordingly designed.
- Score: 79.44066256794187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph-based clustering plays an important role in clustering tasks. As graph
convolution network (GCN), a variant of neural networks on graph-type data, has
achieved impressive performance, it is attractive to find whether GCNs can be
used to augment the graph-based clustering methods on non-graph data, i.e.,
general data. However, given $n$ samples, the graph-based clustering methods
usually need at least $O(n^2)$ time to build graphs and the graph convolution
requires nearly $O(n^2)$ for a dense graph and $O(|\mathcal{E}|)$ for a sparse
one with $|\mathcal{E}|$ edges. In other words, both graph-based clustering and
GCNs suffer from severe inefficiency problems. To tackle this problem and
further employ GCN to promote the capacity of graph-based clustering, we
propose a novel clustering method, AnchorGAE. As the graph structure is not
provided in general clustering scenarios, we first show how to convert a
non-graph dataset into a graph by introducing the generative graph model, which
is used to build GCNs. Anchors are generated from the original data to
construct a bipartite graph such that the computational complexity of graph
convolution is reduced from $O(n^2)$ and $O(|\mathcal{E}|)$ to $O(n)$. The
succeeding steps for clustering can be easily designed as $O(n)$ operations.
Interestingly, the anchors naturally lead to a siamese GCN architecture. The
bipartite graph constructed by anchors is updated dynamically to exploit the
high-level information behind data. Eventually, we theoretically prove that the
simple update will lead to degeneration and a specific strategy is accordingly
designed.
Related papers
- Cayley Graph Propagation [0.0]
We show that truncation is detrimental to the expansion properties of Cayley graph structures.
Instead, we propose CGP, a method to propagate information over a complete Cayley graph structure.
arXiv Detail & Related papers (2024-10-04T13:32:34Z) - Spectral Greedy Coresets for Graph Neural Networks [61.24300262316091]
The ubiquity of large-scale graphs in node-classification tasks hinders the real-world applications of Graph Neural Networks (GNNs)
This paper studies graph coresets for GNNs and avoids the interdependence issue by selecting ego-graphs based on their spectral embeddings.
Our spectral greedy graph coreset (SGGC) scales to graphs with millions of nodes, obviates the need for model pre-training, and applies to low-homophily graphs.
arXiv Detail & Related papers (2024-05-27T17:52:12Z) - Graph Mixup with Soft Alignments [49.61520432554505]
We study graph data augmentation by mixup, which has been used successfully on images.
We propose S-Mixup, a simple yet effective mixup method for graph classification by soft alignments.
arXiv Detail & Related papers (2023-06-11T22:04:28Z) - GC-Flow: A Graph-Based Flow Network for Effective Clustering [10.354035049272095]
Graph convolutional networks (GCNs) are emphdiscriminative models that directly model the class posterior $p(y|mathbfx)$ for semi-supervised classification of graph data.
In this work, we design normalizing flows that replace GCN layers, leading to a emphgenerative model that models both the class conditional likelihood $p(mathbfx|y)$ and the class prior $p(y)$.
The resulting neural network, GC-Flow, retains the graph convolution operations while being equipped with
arXiv Detail & Related papers (2023-05-26T22:11:38Z) - Beyond Homophily: Reconstructing Structure for Graph-agnostic Clustering [15.764819403555512]
It is impossible to first identify a graph as homophilic or heterophilic before a suitable GNN model can be found.
We propose a novel graph clustering method, which contains three key components: graph reconstruction, a mixed filter, and dual graph clustering network.
Our method dominates others on heterophilic graphs.
arXiv Detail & Related papers (2023-05-03T01:49:01Z) - G-Mixup: Graph Data Augmentation for Graph Classification [55.63157775049443]
Mixup has shown superiority in improving the generalization and robustness of neural networks by interpolating features and labels between two random samples.
We propose $mathcalG$-Mixup to augment graphs for graph classification by interpolating the generator (i.e., graphon) of different classes of graphs.
Experiments show that $mathcalG$-Mixup substantially improves the generalization and robustness of GNNs.
arXiv Detail & Related papers (2022-02-15T04:09:44Z) - Imbalanced Graph Classification via Graph-of-Graph Neural Networks [16.589373163769853]
Graph Neural Networks (GNNs) have achieved unprecedented success in learning graph representations to identify categorical labels of graphs.
We introduce a novel framework, Graph-of-Graph Neural Networks (G$2$GNN), which alleviates the graph imbalance issue by deriving extra supervision globally from neighboring graphs and locally from graphs themselves.
Our proposed G$2$GNN outperforms numerous baselines by roughly 5% in both F1-macro and F1-micro scores.
arXiv Detail & Related papers (2021-12-01T02:25:47Z) - Graph Pooling with Node Proximity for Hierarchical Representation
Learning [80.62181998314547]
We propose a novel graph pooling strategy that leverages node proximity to improve the hierarchical representation learning of graph data with their multi-hop topology.
Results show that the proposed graph pooling strategy is able to achieve state-of-the-art performance on a collection of public graph classification benchmark datasets.
arXiv Detail & Related papers (2020-06-19T13:09:44Z) - Adaptive Graph Auto-Encoder for General Data Clustering [90.8576971748142]
Graph-based clustering plays an important role in the clustering area.
Recent studies about graph convolution neural networks have achieved impressive success on graph type data.
We propose a graph auto-encoder for general data clustering, which constructs the graph adaptively according to the generative perspective of graphs.
arXiv Detail & Related papers (2020-02-20T10:11:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.