Biclustering a dataset using photonic quantum computing
- URL: http://arxiv.org/abs/2405.18622v1
- Date: Tue, 28 May 2024 22:04:29 GMT
- Title: Biclustering a dataset using photonic quantum computing
- Authors: Ajinkya Borle, Ameya Bhave,
- Abstract summary: Biclustering is a problem in machine learning and data mining.
We highlight the natural relation that quantum computing models like boson sampling and Gaussian boson sampling (GBS) have to this problem.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Biclustering is a problem in machine learning and data mining that seeks to group together rows and columns of a dataset according to certain criteria. In this work, we highlight the natural relation that quantum computing models like boson and Gaussian boson sampling (GBS) have to this problem. We first explore the use of boson sampling to identify biclusters based on matrix permanents. We then propose a heuristic that finds clusters in a dataset using Gaussian boson sampling by (i) converting the dataset into a bipartite graph and then (ii) running GBS to find the densest sub-graph(s) within the larger bipartite graph. Our simulations for the above proposed heuristics show promising results for future exploration in this area.
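The two ideas named in the abstract can be illustrated classically on a toy matrix. The following is a minimal sketch, not the authors' implementation: the function names, the threshold used to turn entries into edges, and the density measure (fraction of possible row-column edges present) are all assumptions made for illustration. No quantum sampling is performed here; the sketch only shows the quantities that boson sampling (matrix permanents) and GBS (dense sub-graphs of a bipartite graph) are related to.

```python
from itertools import permutations
from math import prod


def permanent(matrix):
    """Brute-force permanent of a small square matrix (O(n!) time)."""
    n = len(matrix)
    return sum(
        prod(matrix[i][p[i]] for i in range(n))
        for p in permutations(range(n))
    )


def bipartite_edges(data, threshold):
    """Step (i), assumed form: link row i to column j when data[i][j] >= threshold."""
    return {
        (i, j)
        for i, row in enumerate(data)
        for j, value in enumerate(row)
        if value >= threshold
    }


def bicluster_density(edges, rows, cols):
    """Fraction of possible row-column edges present for the chosen rows and columns."""
    present = sum((i, j) in edges for i in rows for j in cols)
    return present / (len(rows) * len(cols))


# Toy dataset (hypothetical values): rows 0-1 and columns 0-1 form a dense block.
data = [
    [0.9, 0.8, 0.1],
    [0.7, 0.9, 0.2],
    [0.1, 0.0, 0.3],
]
edges = bipartite_edges(data, threshold=0.5)

print(bicluster_density(edges, rows=[0, 1], cols=[0, 1]))        # dense block
print(bicluster_density(edges, rows=[0, 1, 2], cols=[0, 1, 2]))  # whole matrix
print(permanent([[0.9, 0.8], [0.7, 0.9]]))                       # permanent of the block
```

A candidate bicluster with a higher density, or a submatrix with a larger permanent, is the kind of structure the sampling-based heuristics would be expected to favor; the brute-force permanent is only practical for very small submatrices.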
Related papers
- HBIC: A Biclustering Algorithm for Heterogeneous Datasets [0.0]
Biclustering is an unsupervised machine-learning approach aiming to cluster rows and columns simultaneously in a data matrix.
We introduce a biclustering approach called HBIC, capable of discovering meaningful biclusters in complex heterogeneous data.
arXiv Detail & Related papers (2024-08-23T16:48:10Z)
- ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining [104.34751911174196]
We build a large-scale dataset of 3DGS using ShapeNet and ModelNet datasets.
Our dataset ShapeSplat consists of 65K objects from 87 unique categories.
We introduce Gaussian-MAE, which highlights the unique benefits of representation learning from Gaussian parameters.
arXiv Detail & Related papers (2024-08-20T14:49:14Z)
- Superclustering by finding statistically significant separable groups of optimal gaussian clusters [0.0]
The paper presents an algorithm that clusters a dataset by grouping Gaussian clusters, chosen as optimal under the BIC criterion, into statistically separable superclusters.
An essential advantage of the algorithm is its ability to predict the correct supercluster for new data based on an already trained clusterer.
arXiv Detail & Related papers (2023-09-05T23:49:46Z)
- Boost clustering with Gaussian Boson Sampling: a full quantum approach [0.09437521840642138]
We propose a novel clustering approach based on Gaussian Boson Sampling (GBS).
We benchmark our approach with two well-known classical clustering algorithms.
Results show that our approach outperforms the two classical algorithms in two out of the three chosen metrics.
arXiv Detail & Related papers (2023-07-25T09:05:24Z)
- Spatio-Temporal Surrogates for Interaction of a Jet with High Explosives: Part II -- Clustering Extremely High-Dimensional Grid-Based Data [0.0]
In this report, we consider output data from simulations of a jet interacting with high explosives.
We show how the randomness of both the random projections and the choice of initial centroids in k-means clustering can be used to determine the number of clusters in our data set.
arXiv Detail & Related papers (2023-07-03T23:36:43Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Sample efficient graph classification using binary Gaussian boson sampling [0.0]
We present a variation of a quantum algorithm for the machine learning task of classification with graph-structured data.
Our setup only requires binary (light/no light) detectors, as opposed to photon number resolving detectors.
arXiv Detail & Related papers (2023-01-03T17:23:43Z)
- A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data [71.9573352891936]
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data.
A new EM algorithm is investigated for mixtures of elliptical distributions with the property of handling potential missing data.
Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data.
arXiv Detail & Related papers (2022-01-28T10:01:37Z)
- Learning Spatial Context with Graph Neural Network for Multi-Person Pose Grouping [71.59494156155309]
Bottom-up approaches for image-based multi-person pose estimation consist of two stages: keypoint detection and grouping.
In this work, we formulate the grouping task as a graph partitioning problem, where we learn the affinity matrix with a Graph Neural Network (GNN).
The learned geometry-based affinity is further fused with appearance-based affinity to achieve robust keypoint association.
arXiv Detail & Related papers (2021-04-06T09:21:14Z)
- Kernel learning approaches for summarising and combining posterior similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
arXiv Detail & Related papers (2020-09-27T14:16:14Z)
- Clustering Binary Data by Application of Combinatorial Optimization Heuristics [52.77024349608834]
We study clustering methods for binary data, first defining aggregation criteria that measure the compactness of clusters.
Five new and original methods are introduced, using neighborhoods and population behavior optimization metaheuristics.
Using a set of 16 data tables generated by a quasi-Monte Carlo experiment, one of the aggregations, based on L1 dissimilarity, is compared with hierarchical clustering and with a version of k-means: partitioning around medoids (PAM).
arXiv Detail & Related papers (2020-01-06T23:33:31Z)