The Automatic Quasi-clique Merger algorithm (AQCM)
- URL: http://arxiv.org/abs/2103.04186v1
- Date: Sat, 6 Mar 2021 20:01:59 GMT
- Title: The Automatic Quasi-clique Merger algorithm (AQCM)
- Authors: Scott Payne, Edgar Fuller, George Spirou, Cun-Quan Zhang
- Abstract summary: The Automatic Quasi-clique Merger algorithm is a new algorithm adapted from early work published under the name QCM.
We present the general idea of a quasi-clique agglomerative approach, provide the full details of the mathematical steps of the AQCM algorithm, and explain some of the motivation behind the new methodology.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The Automatic Quasi-clique Merger algorithm is a new algorithm adapted from
early work published under the name QCM (quasi-clique merger) [Ou2006, Ou2007,
Zhao2011, Qi2014]. The AQCM algorithm performs hierarchical clustering in any
data set for which there is an associated similarity measure quantifying the
similarity of any two data points i and j. Importantly, the method exhibits two
valuable performance properties: 1) the ability to automatically return either
a larger or smaller number of clusters depending on the inherent properties of
the data rather than on a parameter, and 2) the ability to return a very large
number of relatively small clusters automatically when such clusters are
reasonably well defined in a data set. In this work we present the general idea
of a quasi-clique agglomerative approach, provide the full details of the
mathematical steps of the AQCM algorithm, and explain some of the motivation
behind the new methodology. The main achievement of the new methodology is that
the agglomerative process now unfolds adaptively according to the inherent
structure unique to a given data set, and this happens without the time-costly
parameter adjustment that drove the previous QCM algorithm. For this reason we
call the new algorithm \emph{automatic}. We provide a demonstration of the
algorithm's performance at the task of community detection in a social media
network of 22,900 nodes.
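Since the abstract describes the agglomerative idea only in prose, the following is a minimal Python sketch of a greedy quasi-clique agglomeration step, not the published AQCM: the function name greedy_quasi_clique_clusters, the fixed density_floor threshold, and the mean-similarity growth rule are simplifying assumptions standing in for AQCM's adaptive, parameter-free merge criteria.
```python
import numpy as np

def greedy_quasi_clique_clusters(S, density_floor=0.6):
    """Toy quasi-clique agglomeration on a symmetric similarity matrix S.

    Illustrative simplification (NOT the published AQCM): seed clusters from
    the highest-similarity pairs, then let each cluster absorb the outside
    node with the largest average similarity to it, as long as the cluster's
    mean pairwise internal similarity stays above `density_floor` (a fixed
    stand-in for AQCM's adaptive criterion).
    """
    n = S.shape[0]
    unassigned = set(range(n))
    clusters = []
    iu, ju = np.triu_indices(n, k=1)
    for k in np.argsort(-S[iu, ju]):              # strongest pairs first
        i, j = int(iu[k]), int(ju[k])
        if i not in unassigned or j not in unassigned:
            continue
        cluster = [i, j]
        unassigned -= {i, j}
        while unassigned:
            cand = max(unassigned, key=lambda v: S[v, cluster].mean())
            grown = cluster + [cand]
            sub = S[np.ix_(grown, grown)]
            if sub[np.triu_indices(len(grown), k=1)].mean() < density_floor:
                break                              # adding cand would dilute the cluster
            cluster.append(cand)
            unassigned.discard(cand)
        clusters.append(cluster)
    clusters.extend([v] for v in sorted(unassigned))  # leftovers as singletons
    return clusters

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S = rng.uniform(0.0, 0.2, size=(6, 6))        # weak background similarity
    S[:3, :3] = S[3:, 3:] = 0.9                   # two planted dense groups
    S = (S + S.T) / 2
    np.fill_diagonal(S, 1.0)
    print(greedy_quasi_clique_clusters(S))        # recovers the two planted groups (node order may vary)
```
The point the abstract emphasizes is precisely that AQCM replaces a fixed threshold like density_floor with a merge criterion that adapts to the structure of the given data set.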
Related papers
- From Large to Small Datasets: Size Generalization for Clustering Algorithm Selection [12.993073967843292]
We study the clustering algorithm selection problem in a semi-supervised setting, with an unknown ground-truth clustering.
We introduce a notion of size generalization for clustering algorithm accuracy.
We use a subsample of as little as 5% of the data to identify which algorithm is best on the full dataset.
arXiv Detail & Related papers (2024-02-22T06:53:35Z)
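As a rough illustration of the subsample-then-select idea summarized above, the sketch below (a hypothetical helper, pick_algorithm_on_subsample) fits two standard clustering algorithms on a 5% subsample and keeps the one with the higher silhouette score; the candidate algorithms and the silhouette criterion are stand-ins, since the paper scores accuracy against an unknown ground-truth clustering rather than an internal index.
```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

def pick_algorithm_on_subsample(X, candidates, frac=0.05, seed=0):
    """Fit each candidate on a small subsample and return the best name.

    Selection uses silhouette score as a stand-in criterion; the cited paper
    instead studies accuracy with respect to an unknown ground-truth clustering.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=max(10, int(frac * len(X))), replace=False)
    Xs = X[idx]
    scores = {}
    for name, make_model in candidates.items():
        labels = make_model().fit_predict(Xs)
        scores[name] = silhouette_score(Xs, labels)
    return max(scores, key=scores.get), scores

if __name__ == "__main__":
    X, _ = make_blobs(n_samples=5000, centers=4, random_state=0)
    candidates = {
        "kmeans": lambda: KMeans(n_clusters=4, n_init=10, random_state=0),
        "average-link": lambda: AgglomerativeClustering(n_clusters=4, linkage="average"),
    }
    best, scores = pick_algorithm_on_subsample(X, candidates, frac=0.05)
    print(best, scores)   # the winner would then be run on the full dataset
```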
- Learning Hidden Markov Models Using Conditional Samples [72.20944611510198]
This paper is concerned with the computational complexity of learning the Hidden Markov Model (HMM).
We consider an interactive access model, in which the algorithm can query for samples from the conditional distributions of the HMMs.
Specifically, we obtain efficient algorithms for learning HMMs in settings where we have query access to the exact conditional probabilities.
arXiv Detail & Related papers (2023-02-28T16:53:41Z)
- NISQ-friendly measurement-based quantum clustering algorithms [0.7373617024876725]
Two novel measurement-based, quantum clustering algorithms are proposed.
The first algorithm follows a divisive approach, the second is based on unsharp measurements.
Both algorithms are simplistic in nature, easy to implement, and well suited for noisy intermediate scale quantum computers.
arXiv Detail & Related papers (2023-02-01T16:38:27Z)
- A Metaheuristic Algorithm for Large Maximum Weight Independent Set Problems [58.348679046591265]
Given a node-weighted graph, find a set of independent (mutually nonadjacent) nodes whose node-weight sum is maximum.
Some of the graphs arising in this application are large, having hundreds of thousands of nodes and hundreds of millions of edges.
We develop a new local search algorithm, which is a metaheuristic in the greedy randomized adaptive search framework.
arXiv Detail & Related papers (2022-03-28T21:34:16Z)
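The entry above names the greedy randomized adaptive search (GRASP) family without detail; the sketch below shows one GRASP-style randomized greedy construction for maximum weight independent set under simplifying assumptions (the alpha restricted-candidate-list rule, the weight-over-degree score, and the omission of the local search phase), and is not the paper's algorithm.
```python
import random

def grasp_mwis_construct(adj, weights, alpha=0.3, seed=0):
    """One GRASP-style randomized greedy construction for MWIS.

    adj: dict node -> set of neighbours; weights: dict node -> float.
    Repeatedly score remaining nodes by weight / (remaining degree + 1),
    form a restricted candidate list of the top alpha fraction, pick one at
    random, add it, and delete it and its neighbours. A full GRASP would
    follow this with a local search phase, omitted here.
    """
    rng = random.Random(seed)
    remaining = set(adj)
    independent = []
    while remaining:
        scored = sorted(
            remaining,
            key=lambda v: weights[v] / (len(adj[v] & remaining) + 1),
            reverse=True,
        )
        rcl = scored[: max(1, int(alpha * len(scored)))]
        v = rng.choice(rcl)
        independent.append(v)
        remaining -= adj[v] | {v}          # keep the set independent
    return independent, sum(weights[v] for v in independent)

if __name__ == "__main__":
    # Tiny 5-node example: a path 0-1-2-3-4 with node weights.
    adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    weights = {0: 3.0, 1: 1.0, 2: 5.0, 3: 1.0, 4: 3.0}
    best = max((grasp_mwis_construct(adj, weights, seed=s) for s in range(20)),
               key=lambda res: res[1])
    print(best)   # e.g. an independent set containing nodes 0, 2, 4 with weight 11.0
```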
- An Exact Algorithm for Semi-supervised Minimum Sum-of-Squares Clustering [0.5801044612920815]
We present a new branch-and-bound algorithm for semi-supervised MSSC.
Background knowledge is incorporated as pairwise must-link and cannot-link constraints.
For the first time, the proposed global optimization algorithm efficiently manages to solve real-world instances with up to 800 data points.
arXiv Detail & Related papers (2021-11-30T17:08:53Z)
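For context on the entry above, the sketch below only spells out the minimum sum-of-squares objective and the must-link/cannot-link feasibility check that such a semi-supervised MSSC solver optimizes; the helper names are hypothetical, and the exact branch-and-bound algorithm itself is not shown.
```python
import numpy as np

def mssc_objective(X, labels):
    """Minimum sum-of-squares clustering objective: total squared distance
    of each point to its cluster centroid."""
    total = 0.0
    for c in np.unique(labels):
        pts = X[labels == c]
        total += ((pts - pts.mean(axis=0)) ** 2).sum()
    return total

def satisfies_constraints(labels, must_link, cannot_link):
    """Check the pairwise background knowledge used in semi-supervised MSSC."""
    ok_ml = all(labels[i] == labels[j] for i, j in must_link)
    ok_cl = all(labels[i] != labels[j] for i, j in cannot_link)
    return ok_ml and ok_cl

if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
    must_link = [(0, 1)]       # points 0 and 1 must share a cluster
    cannot_link = [(1, 2)]     # points 1 and 2 must be separated
    labels = np.array([0, 0, 1, 1])
    print(satisfies_constraints(labels, must_link, cannot_link),
          round(mssc_objective(X, labels), 4))
```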
- DAAS: Differentiable Architecture and Augmentation Policy Search [107.53318939844422]
This work considers the possible coupling between neural architectures and data augmentation and proposes an effective algorithm jointly searching for them.
Our approach achieves 97.91% accuracy on CIFAR-10 and 76.6% Top-1 accuracy on the ImageNet dataset, showing the outstanding performance of our search algorithm.
arXiv Detail & Related papers (2021-09-30T17:15:17Z)
- Hierarchical Agglomerative Graph Clustering in Nearly-Linear Time [1.5644420658691407]
We study the widely used hierarchical agglomerative clustering (HAC) algorithm on edge-weighted graphs.
We define an algorithmic framework for hierarchical agglomerative graph clustering.
We show that our approach can speed up clustering of point datasets by a factor of 20.7--76.5x.
arXiv Detail & Related papers (2021-06-10T09:29:05Z)
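To make the graph HAC setting above concrete, here is a deliberately naive, quadratic average-linkage merge loop on an edge-weighted graph; the function name naive_graph_hac and the treatment of missing edges are assumptions, and the cited paper's contribution is a framework that avoids exactly this quadratic behaviour.
```python
def naive_graph_hac(n, edges):
    """Naive average-linkage agglomerative clustering on an edge-weighted graph.

    n: number of nodes (0..n-1); edges: list of (u, v, weight). Repeatedly
    merges the two clusters joined by the heaviest remaining edge, combining
    parallel edges by a size-weighted average (missing edges are simply
    ignored rather than treated as zero weight, a simplification), and
    records each merge as a dendrogram step.
    """
    weight = {i: {} for i in range(n)}       # weight[a][b]: similarity between clusters a and b
    for u, v, wt in edges:
        weight[u][v] = weight[v][u] = wt
    size = {i: 1 for i in range(n)}
    merges = []
    while any(weight[a] for a in weight):
        # pick the heaviest remaining inter-cluster edge
        a, b = max(((x, y) for x in weight for y in weight[x]),
                   key=lambda ab: weight[ab[0]][ab[1]])
        merges.append((a, b, weight[a][b]))
        # merge b into a: size-weighted average of parallel edges
        for c, wt in weight[b].items():
            if c == a:
                continue
            if c in weight[a]:
                weight[a][c] = (weight[a][c] * size[a] + wt * size[b]) / (size[a] + size[b])
            else:
                weight[a][c] = wt
            weight[c][a] = weight[a][c]
            del weight[c][b]
        del weight[a][b]
        del weight[b]
        size[a] += size[b]
    return merges

if __name__ == "__main__":
    # Two triangles joined by one weak edge.
    edges = [(0, 1, 0.9), (1, 2, 0.8), (0, 2, 0.85),
             (3, 4, 0.9), (4, 5, 0.8), (3, 5, 0.85),
             (2, 3, 0.1)]
    for a, b, wt in naive_graph_hac(6, edges):
        print(f"merge {a} and {b} at weight {wt:.3f}")
```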
- Finding Geometric Models by Clustering in the Consensus Space [61.65661010039768]
We propose a new algorithm for finding an unknown number of geometric models, e.g., homographies.
We present a number of applications where the use of multiple geometric models improves accuracy.
These include pose estimation from multiple generalized homographies and trajectory estimation of fast-moving objects.
arXiv Detail & Related papers (2021-03-25T14:35:07Z)
- Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge Computing [113.52575069030192]
Big data, including data from applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z)
- Relational Algorithms for k-means Clustering [17.552485682328772]
This paper gives a k-means approximation algorithm that is efficient in the relational algorithms model.
The running time is potentially exponentially smaller than $N$, the number of data points to be clustered that the relational database represents.
arXiv Detail & Related papers (2020-08-01T23:21:40Z)
- Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.