Spectral clustering in the Gaussian mixture block model
- URL: http://arxiv.org/abs/2305.00979v3
- Date: Wed, 10 Apr 2024 23:06:38 GMT
- Title: Spectral clustering in the Gaussian mixture block model
- Authors: Shuangping Li, Tselil Schramm,
- Abstract summary: We study the study of clustering and embedding graphs sampled from high-dimensional Gaussian mixture block models.
We analyze the performance of canonical spectral clustering and embedding algorithms for such graphs.
- Score: 2.668787455520979
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gaussian mixture block models are distributions over graphs that strive to model modern networks: to generate a graph from such a model, we associate each vertex $i$ with a latent feature vector $u_i \in \mathbb{R}^d$ sampled from a mixture of Gaussians, and we add edge $(i,j)$ if and only if the feature vectors are sufficiently similar, in that $\langle u_i,u_j \rangle \ge \tau$ for a pre-specified threshold $\tau$. The different components of the Gaussian mixture represent the fact that there may be different types of nodes with different distributions over features -- for example, in a social network each component represents the different attributes of a distinct community. Natural algorithmic tasks associated with these networks are embedding (recovering the latent feature vectors) and clustering (grouping nodes by their mixture component). In this paper we initiate the study of clustering and embedding graphs sampled from high-dimensional Gaussian mixture block models, where the dimension of the latent feature vectors $d\to \infty$ as the size of the network $n \to \infty$. This high-dimensional setting is most appropriate in the context of modern networks, in which we think of the latent feature space as being high-dimensional. We analyze the performance of canonical spectral clustering and embedding algorithms for such graphs in the case of 2-component spherical Gaussian mixtures, and begin to sketch out the information-computation landscape for clustering and embedding in these models.
Related papers
- An Ad-hoc graph node vector embedding algorithm for general knowledge graphs using Kinetica-Graph [0.0]
This paper discusses how to generate general graph node embeddings from knowledge graph representations.
The embedded space is composed of a number of sub-features to mimic both local affinity and remote structural relevance.
arXiv Detail & Related papers (2024-07-22T14:43:10Z) - GeoMix: Towards Geometry-Aware Data Augmentation [76.09914619612812]
Mixup has shown considerable success in mitigating the challenges posed by limited labeled data in image classification.
We propose Geometric Mixup (GeoMix), a simple and interpretable Mixup approach leveraging in-place graph editing.
arXiv Detail & Related papers (2024-07-15T12:58:04Z) - Information-Theoretic Thresholds for Planted Dense Cycles [52.076657911275525]
We study a random graph model for small-world networks which are ubiquitous in social and biological sciences.
For both detection and recovery of the planted dense cycle, we characterize the information-theoretic thresholds in terms of $n$, $tau$, and an edge-wise signal-to-noise ratio $lambda$.
arXiv Detail & Related papers (2024-02-01T03:39:01Z) - Spectral Clustering of Attributed Multi-relational Graphs [11.486261673963392]
Graph clustering aims at discovering a natural grouping of the nodes such as similar nodes are assigned to a common cluster.
We propose SpectralMix, a joint dimensionality reduction technique for multi-relational graphs with categorical node attributes.
arXiv Detail & Related papers (2023-11-03T11:05:29Z) - On the Equivalence of Graph Convolution and Mixup [71.8932383179048]
This paper investigates the relationship between graph convolution and Mixup techniques.
Under two mild conditions, graph convolution can be viewed as a specialized form of Mixup.
We establish this equivalence mathematically by demonstrating that graph convolution networks (GCN) and simplified graph convolution (SGC) can be expressed as a form of Mixup.
arXiv Detail & Related papers (2023-09-29T23:09:54Z) - GrannGAN: Graph annotation generative adversarial networks [72.66289932625742]
We consider the problem of modelling high-dimensional distributions and generating new examples of data with complex relational feature structure coherent with a graph skeleton.
The model we propose tackles the problem of generating the data features constrained by the specific graph structure of each data point by splitting the task into two phases.
In the first it models the distribution of features associated with the nodes of the given graph, in the second it complements the edge features conditionally on the node features.
arXiv Detail & Related papers (2022-12-01T11:49:07Z) - flow-based clustering and spectral clustering: a comparison [0.688204255655161]
We study a novel graph clustering method for data with an intrinsic network structure.
We exploit an intrinsic network structure of data to construct Euclidean feature vectors.
Our results indicate that our clustering methods can cope with certain graph structures.
arXiv Detail & Related papers (2022-06-20T21:49:52Z) - Gaussian mixture model on nodes of Bayesian network given maximal
parental cliques [0.0]
We will explain why and how we use Gaussian mixture models in network.
We propose a new method, called double iteration algorithm, to optimize the mixture model.
arXiv Detail & Related papers (2022-04-20T15:14:01Z) - Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations.
We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
arXiv Detail & Related papers (2021-10-28T14:59:41Z) - Scaling Graph Clustering with Distributed Sketches [1.1011268090482575]
We present a method inspired by spectral clustering where we instead use matrix sketches derived from random dimension-reducing projections.
We show that our method produces embeddings that yield performant clustering results given a fully-dynamic block model stream.
We also discuss the effects of block model parameters upon the required dimensionality of the subsequent embeddings, and show how random projections could significantly improve the performance of graph clustering in distributed memory.
arXiv Detail & Related papers (2020-07-24T17:38:04Z) - Graph Neural Networks with Composite Kernels [60.81504431653264]
We re-interpret node aggregation from the perspective of kernel weighting.
We present a framework to consider feature similarity in an aggregation scheme.
We propose feature aggregation as the composition of the original neighbor-based kernel and a learnable kernel to encode feature similarities in a feature space.
arXiv Detail & Related papers (2020-05-16T04:44:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.