Robust estimation of the intrinsic dimension of data sets with quantum cognition machine learning
- URL: http://arxiv.org/abs/2409.12805v1
- Date: Thu, 19 Sep 2024 14:24:35 GMT
- Title: Robust estimation of the intrinsic dimension of data sets with quantum cognition machine learning
- Authors: Luca Candelori, Alexander G. Abanov, Jeffrey Berger, Cameron J. Hogan, Vahagn Kirakosyan, Kharen Musaelian, Ryan Samson, James E. T. Smith, Dario Villani, Martin T. Wells, Mengjia Xu
- Abstract summary: We propose a new data representation method based on Quantum Cognition Machine Learning and apply it to manifold learning.
We learn a representation of each data point as a quantum state, encoding both local properties of the point and its relation to the entire data set.
Inspired by ideas from quantum geometry, we then construct from the quantum states a point cloud equipped with a quantum metric.
The metric exhibits a spectral gap whose location corresponds to the intrinsic dimension of the data. The proposed estimator is based on the detection of this spectral gap.
- Score: 31.347602507204847
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We propose a new data representation method based on Quantum Cognition Machine Learning and apply it to manifold learning, specifically to the estimation of the intrinsic dimension of data sets. The idea is to learn a representation of each data point as a quantum state, encoding both local properties of the point and its relation to the entire data set. Inspired by ideas from quantum geometry, we then construct from the quantum states a point cloud equipped with a quantum metric. The metric exhibits a spectral gap whose location corresponds to the intrinsic dimension of the data. The proposed estimator is based on the detection of this spectral gap. When tested on synthetic manifold benchmarks, our estimates are shown to be robust with respect to the introduction of point-wise Gaussian noise. This is in contrast to current state-of-the-art estimators, which tend to attribute artificial "shadow dimensions" to noise artifacts, leading to overestimates. This is a significant advantage when dealing with real data sets, which are inevitably affected by unknown levels of noise. We show the applicability and robustness of our method on real data by testing it on the ISOMAP face database, MNIST, and the Wisconsin Breast Cancer Dataset.
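The core idea, an eigenvalue spectrum with a gap at the intrinsic dimension, can be illustrated with a purely classical stand-in: average the eigenvalue gaps of local PCA spectra and read off the dimension at the largest gap. This is a minimal sketch of the spectral-gap detection principle, not the paper's quantum-metric construction (the quantum states and quantum metric are assumed away here).

```python
import numpy as np

def estimate_intrinsic_dim(points, k=20):
    """Estimate intrinsic dimension from the largest average spectral gap
    of local PCA eigenvalue spectra (a classical analogue of detecting
    the gap in the quantum metric's spectrum)."""
    n, d = points.shape
    gaps = np.zeros(d - 1)
    for p in points:
        # k nearest neighbours of p (excluding p itself)
        dists = np.linalg.norm(points - p, axis=1)
        nbrs = points[np.argsort(dists)[1:k + 1]]
        # eigenvalues of the local covariance, in descending order
        cov = np.cov(nbrs - nbrs.mean(axis=0), rowvar=False)
        ev = np.sort(np.linalg.eigvalsh(cov))[::-1]
        gaps += ev[:-1] - ev[1:]      # accumulate consecutive gaps
    # dimension = position of the largest accumulated gap
    return int(np.argmax(gaps)) + 1

# A noisy 2-D plane embedded in 5 ambient dimensions
rng = np.random.default_rng(0)
X = np.zeros((400, 5))
X[:, :2] = rng.uniform(-1, 1, size=(400, 2))
X += 0.01 * rng.normal(size=X.shape)     # point-wise Gaussian noise
print(estimate_intrinsic_dim(X))         # → 2
```

With small noise, the two in-plane eigenvalues dominate and the gap sits between the second and third eigenvalue, so the estimator returns 2 rather than counting the noise directions.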
Related papers
- Quantum Circuits, Feature Maps, and Expanded Pseudo-Entropy: A Categorical Theoretic Analysis of Encoding Real-World Data into a Quantum Computer [0.0]
The aim of this paper is to determine the efficacy of an encoding scheme to map real-world data into a quantum circuit.
The method calculates the Shannon entropy of each of the data points from a point cloud, i.e., samples from an embedded manifold.
arXiv Detail & Related papers (2024-10-29T14:38:01Z)
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
- Neural FIM for learning Fisher Information Metrics from point cloud data [71.07939200676199]
We propose neural FIM, a method for computing the Fisher information metric (FIM) from point cloud data.
We demonstrate its utility in selecting parameters for the PHATE visualization method, as well as its ability to obtain information about local volume, illuminating branching points and cluster centers in embeddings of a toy dataset and two single-cell datasets of iPSC reprogramming and PBMCs (immune cells).
arXiv Detail & Related papers (2023-06-01T17:36:13Z)
- Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning [89.00646449740606]
Self-supervised representation learning follows a paradigm of withholding some part of the data and tasking the network to predict it from the remaining part.
Data augmentation lies at the core for creating the information gap.
In this paper, we explore the channel dimension for generic data augmentation by exploiting precision redundancy.
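The idea of exploiting precision redundancy can be sketched as a channel-wise augmentation that snaps each channel to a randomly chosen number of uniform quantization levels. This is an illustrative sketch of the concept only; the bin ranges and sampling scheme here are assumptions, not the paper's exact recipe.

```python
import numpy as np

def randomized_quantize(x, rng, min_bins=4, max_bins=16):
    """Randomized channel-wise quantization as a data augmentation:
    each channel is discretized to a random number of uniformly
    spaced levels, discarding precision redundancy."""
    out = np.empty_like(x, dtype=float)
    for c in range(x.shape[0]):                    # iterate over channels
        bins = rng.integers(min_bins, max_bins + 1)
        lo, hi = x[c].min(), x[c].max()
        # normalize to [0, 1], quantize to `bins` levels, map back
        norm = (x[c] - lo) / (hi - lo + 1e-12)
        q = np.floor(norm * bins).clip(0, bins - 1)
        out[c] = lo + (q + 0.5) / bins * (hi - lo)
    return out

rng = np.random.default_rng(0)
img = rng.uniform(size=(3, 8, 8))                  # toy 3-channel "image"
aug = randomized_quantize(img, rng)
print(aug.shape, np.unique(aug[0]).size)           # same shape, few distinct levels
```

Because the number of levels is resampled per channel and per call, two augmented views of the same input differ, which creates the information gap that self-supervised objectives rely on.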
arXiv Detail & Related papers (2022-12-19T18:59:57Z)
- A didactic approach to quantum machine learning with a single qubit [68.8204255655161]
We focus on the case of learning with a single qubit, using data re-uploading techniques.
We implement the different proposed formulations on toy and real-world datasets using the Qiskit quantum computing SDK.
arXiv Detail & Related papers (2022-11-23T18:25:32Z)
- Parametric t-Stochastic Neighbor Embedding With Quantum Neural Network [0.6946929968559495]
t-Stochastic Neighbor Embedding (t-SNE) is a non-parametric data visualization method in classical machine learning.
We propose to use quantum neural networks for parametric t-SNE to reflect the characteristics of high-dimensional quantum data on low-dimensional data.
arXiv Detail & Related papers (2022-02-09T02:49:54Z)
- Tree tensor network classifiers for machine learning: from quantum-inspired to quantum-assisted [0.0]
We describe a quantum-assisted machine learning (QAML) method in which multivariate data is encoded into quantum states in a Hilbert space whose dimension is exponentially large in the length of the data vector.
We present an approach that can be implemented on gate-based quantum computing devices.
arXiv Detail & Related papers (2021-04-06T02:31:48Z)
- Nearest Centroid Classification on a Trapped Ion Quantum Computer [57.5195654107363]
We design a quantum Nearest Centroid classifier, using techniques for efficiently loading classical data into quantum states and performing distance estimations.
We experimentally demonstrate it on an 11-qubit trapped-ion quantum machine, matching the accuracy of classical nearest centroid classifiers for the MNIST handwritten digits dataset and achieving up to 100% accuracy for 8-dimensional synthetic data.
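The classification rule being accelerated is the classical nearest centroid decision: assign each point to the class whose mean is closest. A minimal classical version, on 8-dimensional synthetic blobs echoing the experiment above (the blob parameters here are illustrative assumptions, and the quantum data-loading and distance-estimation steps are omitted):

```python
import numpy as np

class NearestCentroid:
    """Classical nearest centroid classifier: predict the class whose
    training-set mean (centroid) is nearest in Euclidean distance."""

    def fit(self, X, y):
        self.labels_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.labels_])
        return self

    def predict(self, X):
        # pairwise distances from each sample to each centroid
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.labels_[np.argmin(d, axis=1)]

# Two well-separated 8-dimensional Gaussian blobs
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(-2, 1, (50, 8)), rng.normal(2, 1, (50, 8))])
y = np.array([0] * 50 + [1] * 50)
clf = NearestCentroid().fit(X, y)
print((clf.predict(X) == y).mean())   # → 1.0
```

The quantum version replaces the explicit distance computation with amplitude-encoded states and quantum distance estimation, but the decision boundary it implements is this same rule.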
arXiv Detail & Related papers (2020-12-08T01:10:30Z)
- Representation Learning for Sequence Data with Deep Autoencoding Predictive Components [96.42805872177067]
We propose a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space.
We encourage this latent structure by maximizing an estimate of predictive information of latent feature sequences, which is the mutual information between past and future windows at each time step.
We demonstrate that our method recovers the latent space of noisy dynamical systems, extracts predictive features for forecasting tasks, and improves automatic speech recognition when used to pretrain the encoder on large amounts of unlabeled data.
arXiv Detail & Related papers (2020-10-07T03:34:01Z)
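The predictive-information objective described above, mutual information between past and future windows, has a simple closed form under a Gaussian assumption: I(past; future) = ½(log det Σ_p + log det Σ_f − log det Σ_joint). A sketch of that estimator (a Gaussian approximation for illustration, not the paper's neural estimator):

```python
import numpy as np

def gaussian_predictive_info(z, window=2):
    """Gaussian estimate of predictive information: mutual information
    between length-`window` past and future slices of a 1-D sequence."""
    pairs = np.array([np.concatenate([z[t - window:t], z[t:t + window]])
                      for t in range(window, len(z) - window + 1)])
    cov = np.cov(pairs, rowvar=False)
    k = window
    log_det_past = np.linalg.slogdet(cov[:k, :k])[1]
    log_det_future = np.linalg.slogdet(cov[k:, k:])[1]
    log_det_joint = np.linalg.slogdet(cov)[1]
    return 0.5 * (log_det_past + log_det_future - log_det_joint)

rng = np.random.default_rng(0)
# an AR(1) sequence carries predictive information; white noise carries ~none
ar = np.empty(2000); ar[0] = 0.0
for t in range(1, 2000):
    ar[t] = 0.9 * ar[t - 1] + rng.normal()
print(gaussian_predictive_info(ar) >
      gaussian_predictive_info(rng.normal(size=2000)))   # → True
```

A representation trained to maximize this quantity is pushed toward latent sequences with simple, predictable dynamics, which is the structural prior the paper's summary describes.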
This list is automatically generated from the titles and abstracts of the papers in this site.