Measuring inter-cluster similarities with Alpha Shape TRIangulation in
loCal Subspaces (ASTRICS) facilitates visualization and clustering of
high-dimensional data
- URL: http://arxiv.org/abs/2107.07603v1
- Date: Thu, 15 Jul 2021 20:51:06 GMT
- Authors: Joshua M. Scurll
- Abstract summary: Clustering and visualizing high-dimensional (HD) data are important tasks in a variety of fields.
Some of the most effective algorithms for clustering HD data are based on representing the data by nodes in a graph.
I propose a new method called ASTRICS to measure similarity between clusters of HD data points.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clustering and visualizing high-dimensional (HD) data are important tasks in
a variety of fields. For example, in bioinformatics, they are crucial for
analyses of single-cell data such as mass cytometry (CyTOF) data. Some of the
most effective algorithms for clustering HD data are based on representing the
data by nodes in a graph, with edges connecting neighbouring nodes according to
some measure of similarity or distance. However, users of graph-based
algorithms are typically faced with the critical but challenging task of
choosing the value of an input parameter that sets the size of neighbourhoods
in the graph, e.g. the number of nearest neighbours to which to connect each
node or a threshold distance for connecting nodes. The burden on the user could
be alleviated by a measure of inter-node similarity that can have value 0 for
dissimilar nodes without requiring any user-defined parameters or thresholds.
This would determine the neighbourhoods automatically while still yielding a
sparse graph. To this end, I propose a new method called ASTRICS to measure
similarity between clusters of HD data points based on local dimensionality
reduction and triangulation of critical alpha shapes. I show that my ASTRICS
similarity measure can facilitate both clustering and visualization of HD data
by using it in Stage 2 of a three-stage pipeline: Stage 1 = perform an initial
clustering of the data by any method; Stage 2 = let graph nodes represent
initial clusters instead of individual data points and use ASTRICS to
automatically define edges between nodes; Stage 3 = use the graph for further
clustering and visualization. This trades the critical task of choosing a graph
neighbourhood size for the easier task of essentially choosing a resolution at
which to view the data. The graph and consequently downstream clustering and
visualization are then automatically adapted to the chosen resolution.
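The three-stage pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: `placeholder_similarity` is a hypothetical stand-in that only mimics the property of returning exactly 0 for dissimilar clusters without a user-set threshold. ASTRICS itself (local dimensionality reduction and triangulation of critical alpha shapes) is not reproduced here. All function names are invented for this example, the Stage 1 labels are taken as given, and Stage 3 is reduced to merging connected components.

```python
import math
from itertools import combinations

def group_by_label(points, labels):
    """Stage 1 output -> dict mapping each initial cluster label to its points."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters

def nn_scale(cluster):
    """Largest nearest-neighbour distance inside one cluster (0 for singletons)."""
    if len(cluster) < 2:
        return 0.0
    return max(
        min(math.dist(p, q) for j, q in enumerate(cluster) if j != i)
        for i, p in enumerate(cluster)
    )

def placeholder_similarity(a, b):
    """Hypothetical stand-in, NOT the ASTRICS measure from the paper.

    Returns 0 when the gap between two clusters exceeds the internal
    point spacing of both clusters, so no external threshold is needed.
    """
    gap = min(math.dist(p, q) for p in a for q in b)
    scale = max(nn_scale(a), nn_scale(b))
    if scale == 0.0 or gap > scale:
        return 0.0
    return 1.0 - gap / (2.0 * scale)

def build_graph(clusters, similarity):
    """Stage 2: nodes are initial clusters; keep only nonzero-similarity edges."""
    edges = {}
    for u, v in combinations(sorted(clusters), 2):
        weight = similarity(clusters[u], clusters[v])
        if weight > 0:
            edges[(u, v)] = weight
    return edges

def connected_components(nodes, edges):
    """Stage 3 (one simple choice): merge initial clusters linked in the graph."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for u, v in edges:
        parent[find(u)] = find(v)
    return {n: find(n) for n in nodes}

# Toy data: two adjacent initial clusters and one distant cluster.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0),
          (20, 0), (21, 0), (22, 0)]
initial_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]  # assumed Stage 1 result

clusters = group_by_label(points, initial_labels)
edges = build_graph(clusters, placeholder_similarity)
merged = connected_components(clusters, edges)
final_labels = [merged[label] for label in initial_labels]
print(edges)
print(final_labels)
```

In this sketch the resolution is set entirely by the Stage 1 labels. The graph has one node per initial cluster, and its sparsity follows from the similarity function returning 0 rather than from a neighbourhood-size parameter.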
Related papers
- Cluster-based Graph Collaborative Filtering [55.929052969825825]
Graph Convolution Networks (GCNs) have succeeded in learning user and item representations for recommendation systems.
Most existing GCN-based methods overlook the multiple interests of users while performing high-order graph convolution.
We propose a novel GCN-based recommendation model, termed Cluster-based Graph Collaborative Filtering (ClusterGCF).
arXiv Detail & Related papers (2024-04-16T07:05:16Z)
- Graph Transformer GANs with Graph Masked Modeling for Architectural Layout Generation [153.92387500677023]
We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations.
The proposed graph Transformer encoder combines graph convolutions and self-attentions in a Transformer to model both local and global interactions.
We also propose a novel self-guided pre-training method for graph representation learning.
arXiv Detail & Related papers (2024-01-15T14:36:38Z)
- GrannGAN: Graph annotation generative adversarial networks [72.66289932625742]
We consider the problem of modelling high-dimensional distributions and generating new examples of data with complex relational feature structure coherent with a graph skeleton.
The model we propose tackles the problem of generating the data features constrained by the specific graph structure of each data point by splitting the task into two phases.
In the first phase, it models the distribution of features associated with the nodes of the given graph; in the second, it complements the edge features conditionally on the node features.
arXiv Detail & Related papers (2022-12-01T11:49:07Z)
- Learning Optimal Graph Filters for Clustering of Attributed Graphs [20.810096547938166]
Many real-world systems can be represented as graphs where the different entities in the system are represented by nodes and their interactions by edges.
An important task in studying large datasets with graphical structure is graph clustering.
We introduce a graph signal processing based approach, where we learn the parameters of Finite Impulse Response (FIR) and Autoregressive Moving Average (ARMA) graph filters optimized for clustering.
arXiv Detail & Related papers (2022-11-09T01:49:23Z)
- Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
We propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC).
In SCAGC, a self-supervised contrastive loss is designed for node representation learning by leveraging inaccurate clustering labels.
For the OOS nodes, SCAGC can directly calculate their clustering labels.
arXiv Detail & Related papers (2021-10-15T03:25:28Z)
- Clustering Plotted Data by Image Segmentation [12.443102864446223]
Clustering algorithms are among the main analytical methods for detecting patterns in unlabeled data.
In this paper, we present a wholly different way of clustering points in 2-dimensional space, inspired by how humans cluster data.
Our approach, Visual Clustering, has several advantages over traditional clustering algorithms.
arXiv Detail & Related papers (2021-10-06T06:19:30Z)
- Seeing All From a Few: Nodes Selection Using Graph Pooling for Graph Clustering [37.68977275752782]
Noisy edges and nodes in the graph may make the clustering results worse.
We propose a novel dual graph embedding network (DGEN) to improve the robustness of graph clustering to noisy nodes and edges.
Experiments on three benchmark graph datasets demonstrate its superiority over several state-of-the-art algorithms.
arXiv Detail & Related papers (2021-04-30T06:51:51Z)
- Spatial-Spectral Clustering with Anchor Graph for Hyperspectral Image [88.60285937702304]
This paper proposes a novel unsupervised approach called spatial-spectral clustering with anchor graph (SSCAG) for HSI data clustering.
The proposed SSCAG is competitive against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-04-24T08:09:27Z)
- Graph InfoClust: Leveraging cluster-level node information for unsupervised graph representation learning [12.592903558338444]
We propose a graph representation learning method called Graph InfoClust.
It seeks to additionally capture cluster-level information content.
This optimization leads the node representations to capture richer information and nodal interactions, which improves their quality.
arXiv Detail & Related papers (2020-09-15T09:33:20Z)
- GPS-Net: Graph Property Sensing Network for Scene Graph Generation [91.60326359082408]
Scene graph generation (SGG) aims to detect objects in an image along with their pairwise relationships.
GPS-Net fully explores three properties for SGG: edge direction information, the difference in priority between nodes, and the long-tailed distribution of relationships.
GPS-Net achieves state-of-the-art performance on three popular databases (VG, OI, and VRD), with significant gains under various settings and metrics.
arXiv Detail & Related papers (2020-03-29T07:22:31Z)