Data Clustering as an Emergent Consensus of Autonomous Agents
- URL: http://arxiv.org/abs/2204.10585v1
- Date: Fri, 22 Apr 2022 09:11:35 GMT
- Title: Data Clustering as an Emergent Consensus of Autonomous Agents
- Authors: Piotr Minakowski and Jan Peszek
- Abstract summary: We present a data segmentation method based on a first-order density-induced consensus protocol.
We provide a mathematically rigorous analysis of the consensus model leading to the stopping criteria of the data segmentation algorithm.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a data segmentation method based on a first-order density-induced
consensus protocol. We provide a mathematically rigorous analysis of the
consensus model leading to the stopping criteria of the data segmentation
algorithm. To illustrate our method, the algorithm is applied to
two-dimensional shape datasets and selected images from Berkeley Segmentation
Dataset. The method can be seen as an augmentation of classical clustering
techniques for multimodal feature space, such as DBSCAN. It showcases a curious
connection between data clustering and collective behavior.
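To make the idea of clustering by consensus concrete, here is a minimal sketch of a generic first-order, bounded-confidence consensus iteration. It is not the authors' density-induced protocol, whose details are not given on this page. The function name `consensus_cluster`, the parameters `radius`, `step`, `tol`, and `max_iter`, the velocity-based stopping rule, and the final labelling step are all assumptions made for illustration.

```python
import numpy as np

def consensus_cluster(points, radius=1.0, step=0.1, tol=1e-6, max_iter=10_000):
    """Illustrative bounded-confidence consensus (hypothetical stand-in,
    NOT the paper's density-induced protocol)."""
    x = np.asarray(points, dtype=float).copy()
    n = len(x)
    for _ in range(max_iter):
        diff = x[None, :, :] - x[:, None, :]      # diff[i, j] = x_j - x_i
        dist = np.linalg.norm(diff, axis=2)       # pairwise distances
        weights = (dist < radius).astype(float)   # agents interact only within `radius`
        # First-order dynamics: each agent moves toward its current neighbours.
        velocity = (weights[:, :, None] * diff).sum(axis=1) / n
        x += step * velocity                      # explicit Euler step
        if np.abs(velocity).max() < tol:          # assumed stopping rule
            break
    # Agents that ended up (almost) at the same position share a label.
    labels = -np.ones(n, dtype=int)
    next_label = 0
    for i in range(n):
        if labels[i] == -1:
            close = np.linalg.norm(x - x[i], axis=1) < radius / 2
            labels[close & (labels == -1)] = next_label
            next_label += 1
    return labels, x

# Two well-separated groups of agents (made-up data).
data = [[0.0, 0.0], [0.5, 0.0], [0.0, 0.5],
        [10.0, 10.0], [10.5, 10.0], [10.0, 10.5]]
labels, final_positions = consensus_cluster(data, radius=2.0)
print(labels)
```

Because the interaction weights are symmetric, each group's mean position is preserved, so the agents in a group collapse onto that group's centroid and the clusters can be read off from the final positions.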
Related papers
- A Deterministic Information Bottleneck Method for Clustering Mixed-Type Data [0.0]
We present an information-theoretic method for clustering mixed-type data, that is, data consisting of both continuous and categorical variables.
The method is a variant of the Deterministic Information Bottleneck algorithm which optimally compresses the data while retaining relevant information about the underlying structure.
arXiv Detail & Related papers (2024-07-03T09:06:19Z)
- Unified Multi-View Orthonormal Non-Negative Graph Based Clustering Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates the multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
arXiv Detail & Related papers (2022-11-03T08:18:27Z)
- Consistency between ordering and clustering methods for graphs [0.8594140167290096]
We investigate methodological relationships between several clustering and ordering methods.
We propose a measure called the label continuity error, which generically quantifies the degree of consistency between a sequence and partition.
Based on synthetic and real-world datasets, we evaluate the extents to which an ordering method identifies a module structure.
arXiv Detail & Related papers (2022-08-27T05:55:26Z)
- Clustering Optimisation Method for Highly Connected Biological Data [0.0]
We show how a simple metric for connectivity clustering evaluation leads to an optimised segmentation of biological data.
The novelty of the work resides in the creation of a simple optimisation method for clustering crowded data.
arXiv Detail & Related papers (2022-08-08T17:33:32Z)
- Gradient Based Clustering [72.15857783681658]
We propose a general approach for distance based clustering, using the gradient of the cost function that measures clustering quality.
The approach is an iterative two step procedure (alternating between cluster assignment and cluster center updates) and is applicable to a wide range of functions.
arXiv Detail & Related papers (2022-02-01T19:31:15Z)
- Clustering Plotted Data by Image Segmentation [12.443102864446223]
Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data.
In this paper, we present a wholly different way of clustering points in 2-dimensional space, inspired by how humans cluster data.
Our approach, Visual Clustering, has several advantages over traditional clustering algorithms.
arXiv Detail & Related papers (2021-10-06T06:19:30Z)
- Weighted Sparse Subspace Representation: A Unified Framework for Subspace Clustering, Constrained Clustering, and Active Learning [0.3553493344868413]
We first propose a novel spectral-based subspace clustering algorithm that seeks to represent each point as a sparse convex combination of a few nearby points.
We then extend the algorithm to constrained clustering and active learning settings.
Our motivation for developing such a framework stems from the fact that typically either a small amount of labelled data is available in advance; or it is possible to label some points at a cost.
arXiv Detail & Related papers (2021-06-08T13:39:43Z)
- Determinantal consensus clustering [77.34726150561087]
We propose the use of determinantal point processes or DPP for the random restart of clustering algorithms.
DPPs favor diversity of the center points within subsets.
We show through simulations that, contrary to DPP, sampling the center points uniformly at random fails both to ensure diversity and to obtain a good coverage of all data facets.
arXiv Detail & Related papers (2021-02-07T23:48:24Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Conjoined Dirichlet Process [63.89763375457853]
We develop a novel, non-parametric probabilistic biclustering method based on Dirichlet processes to identify biclusters with strong co-occurrence in both rows and columns.
We apply our method to two different applications, text mining and gene expression analysis, and demonstrate that our method improves bicluster extraction in many settings compared to existing approaches.
arXiv Detail & Related papers (2020-02-08T19:41:23Z)
- Clustering Binary Data by Application of Combinatorial Optimization Heuristics [52.77024349608834]
We study clustering methods for binary data, first defining aggregation criteria that measure the compactness of clusters.
Five new and original methods are introduced, using neighborhoods and population behavior optimization metaheuristics.
From a set of 16 data tables generated by a quasi-Monte Carlo experiment, a comparison is performed for one of the aggregations using L1 dissimilarity, with hierarchical clustering, and a version of k-means: partitioning around medoids or PAM.
arXiv Detail & Related papers (2020-01-06T23:33:31Z)