Clustering with Tangles: Algorithmic Framework and Theoretical Guarantees
- URL: http://arxiv.org/abs/2006.14444v3
- Date: Sun, 6 Nov 2022 22:11:38 GMT
- Title: Clustering with Tangles: Algorithmic Framework and Theoretical Guarantees
- Authors: Solveig Klepper, Christian Elbracht, Diego Fioravanti, Jakob Kneip,
Luca Rendsburg, Maximilian Teegen, Ulrike von Luxburg
- Abstract summary: In this paper, we showcase the practical potential of tangles in machine learning applications.
Given a collection of cuts of any dataset, tangles aggregate these cuts to point in the direction of a dense structure.
We construct the algorithmic framework for clustering with tangles, prove theoretical guarantees in various settings, and provide extensive simulations and use cases.
- Score: 10.992467680364962
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Originally, tangles were invented as an abstract tool in mathematical graph
theory to prove the famous graph minor theorem. In this paper, we showcase the
practical potential of tangles in machine learning applications. Given a
collection of cuts of any dataset, tangles aggregate these cuts to point in the
direction of a dense structure. As a result, a cluster is softly characterized
by a set of consistent pointers. This highly flexible approach can solve
clustering problems in various setups, ranging from questionnaires over
community detection in graphs to clustering points in metric spaces. The output
of our proposed framework is hierarchical and induces the notion of a soft
dendrogram, which can help explore the cluster structure of a dataset. The
computational complexity of aggregating the cuts is linear in the number of
data points. Thus the bottleneck of the tangle approach is to generate the
cuts, for which simple and fast algorithms form a sufficient basis. In our
paper we construct the algorithmic framework for clustering with tangles, prove
theoretical guarantees in various settings, and provide extensive simulations
and use cases. Python code is available on GitHub.
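The abstract's core idea can be made concrete with a small sketch: a cut is a bipartition of the data points, an orientation picks one side of each cut, and a tangle is an orientation in which the chosen sides consistently overlap. The following is a minimal illustrative sketch, not the authors' GitHub implementation; the agreement parameter `a`, the triple-overlap consistency test, and the brute-force enumeration are assumptions made here for illustration (the paper's framework aggregates cuts in time linear in the number of data points).

```python
# Illustrative sketch of tangles over a collection of cuts.
# Assumption: consistency means every (multi)set of three chosen sides
# shares at least `a` points, with `a` an agreement parameter.
from itertools import combinations_with_replacement, product

def side(cut, orientation):
    """Return the chosen side of a bipartition as a set of point indices."""
    return {i for i, b in enumerate(cut) if b == orientation}

def is_consistent(cuts, orientation, a):
    """Check the tangle condition: every triple of chosen sides
    (with repetition) intersects in at least `a` points."""
    sides = [side(c, o) for c, o in zip(cuts, orientation)]
    return all(len(s1 & s2 & s3) >= a
               for s1, s2, s3 in combinations_with_replacement(sides, 3))

def tangles(cuts, a):
    """Brute-force enumeration of all consistent orientations.
    Exponential in the number of cuts -- for illustration only."""
    return [o for o in product([True, False], repeat=len(cuts))
            if is_consistent(cuts, o, a)]

# Toy dataset: 6 points in two dense groups {0,1,2} and {3,4,5},
# described by one clean cut and two noisy ones.
cuts = [
    [True, True, True, False, False, False],   # separates the two groups
    [True, True, True, True, False, False],    # noisy version of the cut
    [False, True, True, False, False, False],  # another noisy cut
]
print(tangles(cuts, a=2))
```

On this toy input, exactly two tangles survive, one per dense group: each tangle orients every cut toward the same cluster, softly "pointing at" it in the sense of the abstract.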
Related papers
- Improving embedding of graphs with missing data by soft manifolds [51.425411400683565]
The reliability of graph embeddings depends on how much the geometry of the continuous space matches the graph structure.
We introduce a new class of manifolds, called soft manifolds, that can address this issue.
Using soft manifold for graph embedding, we can provide continuous spaces to pursue any task in data analysis over complex datasets.
arXiv Detail & Related papers (2023-11-29T12:48:33Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a single framework.
In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
- Latent Random Steps as Relaxations of Max-Cut, Min-Cut, and More [30.919536115917726]
We present a probabilistic model based on non-negative matrix factorization which unifies clustering and simplification.
By relaxing the hard clustering to a soft clustering, our algorithm relaxes potentially hard clustering problems to tractable ones.
arXiv Detail & Related papers (2023-08-12T02:47:57Z)
- NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
arXiv Detail & Related papers (2023-06-14T09:21:15Z)
- ClusterFuG: Clustering Fully connected Graphs by Multicut [20.254912065749956]
In dense multicut, the clustering objective is given in a factorized form as inner products of node feature vectors.
We show how to rewrite classical greedy algorithms for multicut in our dense setting and how to modify them for greater efficiency and solution quality.
arXiv Detail & Related papers (2023-01-28T11:10:50Z)
- flow-based clustering and spectral clustering: a comparison [0.688204255655161]
We study a novel graph clustering method for data with an intrinsic network structure.
We exploit an intrinsic network structure of data to construct Euclidean feature vectors.
Our results indicate that our clustering methods can cope with certain graph structures.
arXiv Detail & Related papers (2022-06-20T21:49:52Z)
- Skew-Symmetric Adjacency Matrices for Clustering Directed Graphs [5.301300942803395]
Cut-based directed graph (digraph) clustering often focuses on finding dense within-cluster or sparse between-cluster connections.
In flow-based clusterings, the edges between clusters tend to be oriented in one direction; such structures have been found in migration data, food webs, and trade data.
arXiv Detail & Related papers (2022-03-02T20:07:04Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
- An Integer Linear Programming Framework for Mining Constraints from Data [81.60135973848125]
We present a general framework for mining constraints from data.
In particular, we consider the inference in structured output prediction as an integer linear programming (ILP) problem.
We show that our approach can learn to solve 9x9 Sudoku puzzles and minimal spanning tree problems from examples without providing the underlying rules.
arXiv Detail & Related papers (2020-06-18T20:09:53Z)
- Graph Neural Networks with Composite Kernels [60.81504431653264]
We re-interpret node aggregation from the perspective of kernel weighting.
We present a framework to consider feature similarity in an aggregation scheme.
We propose feature aggregation as the composition of the original neighbor-based kernel and a learnable kernel to encode feature similarities in a feature space.
arXiv Detail & Related papers (2020-05-16T04:44:29Z)
- Clustering by Constructing Hyper-Planes [0.0]
We present a clustering algorithm by finding hyper-planes to distinguish data points.
It relies on the marginal space between the points to determine the centers and the number of clusters.
Because the algorithm is based on linear structures, it can approximate the distribution of datasets accurately and flexibly.
arXiv Detail & Related papers (2020-04-25T08:52:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.