Influence of Data Dimensionality Reduction Methods on the Effectiveness of Quantum Machine Learning Models
- URL: http://arxiv.org/abs/2511.03320v1
- Date: Wed, 05 Nov 2025 09:34:12 GMT
- Title: Influence of Data Dimensionality Reduction Methods on the Effectiveness of Quantum Machine Learning Models
- Authors: Aakash Ravindra Shinde, Jukka K. Nurminen
- Abstract summary: We analyze how data reduction methods affect different quantum machine learning models. Our findings lead us to conclude that using data dimensionality reduction methods skews performance metric values.
- Score: 0.3437656066916039
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data dimensionality reduction techniques are often used when implementing Quantum Machine Learning (QML) models, for two reasons: the constraints of NISQ quantum devices, which are noisy and offer a limited number of qubits, and the cost of simulating a large number of qubits on classical hardware. Relying on these techniques also raises scalability concerns, as dimensionality reduction methods adapt slowly to large datasets. In this article, we analyze how data reduction methods affect different QML models. We run this experiment over several generated datasets, quantum machine learning algorithms, quantum data encoding methods, and data reduction methods, and evaluate all models on accuracy, precision, recall, and F1 score. Our findings lead us to conclude that data dimensionality reduction skews performance metric values and thereby misrepresents the actual performance of QML models. Several factors compound this problem alongside the reduction method itself: the characteristics of the dataset, the classical-to-quantum information embedding method, the percentage of features removed, the classical components attached to the quantum model, and the structure of the QML model. We consistently observed accuracy differences of 14% to 48% between models trained with and without data reduction. We also observed that some data reduction methods perform better with specific data embedding methodologies and ansatz constructions.
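The pipeline compared in the abstract can be illustrated with a small, fully classical simulation. The sketch below is an illustrative assumption rather than the authors' code: the generated dataset, the 4-qubit budget, and the product-state RY angle-encoding fidelity kernel are all stand-ins chosen so the with/without-reduction comparison stays runnable without a quantum SDK.
```python
# Hedged sketch of the experiment the abstract describes (not the authors'
# code): train the same quantum-kernel SVM with and without PCA reduction
# and report accuracy, precision, recall, and F1 score for each.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

def fidelity_kernel(A, B):
    # Product-state angle encoding |psi(x)> = tensor_i RY(x_i)|0> has the
    # closed form K(x, z) = |<psi(x)|psi(z)>|^2 = prod_i cos^2((x_i - z_i)/2),
    # so this "quantum" kernel can be evaluated exactly in NumPy.
    diff = A[:, None, :] - B[None, :, :]
    return np.prod(np.cos(diff / 2.0) ** 2, axis=-1)

def evaluate(Xtr, Xte, ytr, yte):
    clf = SVC(kernel="precomputed").fit(fidelity_kernel(Xtr, Xtr), ytr)
    pred = clf.predict(fidelity_kernel(Xte, Xtr))
    return {m.__name__: round(m(yte, pred), 3)
            for m in (accuracy_score, precision_score, recall_score, f1_score)}

X, y = make_classification(n_samples=400, n_features=16, n_informative=8,
                           random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Baseline: all 16 features, rescaled to rotation angles in [0, pi].
s16 = MinMaxScaler((0, np.pi)).fit(Xtr)
print("16 features:", evaluate(s16.transform(Xtr), s16.transform(Xte), ytr, yte))

# Reduced: PCA down to a hypothetical 4-qubit budget, then the same encoding.
pca = PCA(n_components=4).fit(Xtr)
s4 = MinMaxScaler((0, np.pi)).fit(pca.transform(Xtr))
print("PCA -> 4:  ", evaluate(s4.transform(pca.transform(Xtr)),
                              s4.transform(pca.transform(Xte)), ytr, yte))
```
The gap between the two printed metric rows is the kind of reduction-induced skew the abstract quantifies; swapping the kernel, the encoding, or the reduction percentage probes the other factors the authors list.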
Related papers
- Multi-Dimensional Visual Data Recovery: Scale-Aware Tensor Modeling and Accelerated Randomized Computation [51.65236537605077]
We propose a new type of network compression optimization technique, fully randomized tensor network compression (FCTN).
FCTN has significant advantages in correlation characterization and transpositional invariance in algebra, and has notable achievements in multi-dimensional data processing and analysis.
We derive efficient algorithms with guarantees to solve the formulated models.
arXiv Detail & Related papers (2026-02-13T14:56:37Z)
- Learning Reduced Representations for Quantum Classifiers [1.4446723310060385]
We apply dimensionality reduction methods to a particle physics data set to train a quantum support vector machine.
We show that the autoencoder methods learn a better lower-dimensional representation of the data, with the method we design, the Sinkclass autoencoder, performing 40% better than the baseline.
arXiv Detail & Related papers (2025-12-01T10:34:41Z)
- Understanding Data Influence with Differential Approximation [63.817689230826595]
We introduce a new formulation to approximate a sample's influence by accumulating the differences in influence between consecutive learning steps, which we term Diff-In.
By employing second-order approximations, we approximate these difference terms with high accuracy while eliminating the need for model convexity required by existing methods.
Our theoretical analysis demonstrates that Diff-In achieves significantly lower approximation error compared to existing influence estimators.
arXiv Detail & Related papers (2025-08-20T11:59:32Z)
- Evaluating Angle and Amplitude Encoding Strategies for Variational Quantum Machine Learning: their impact on model's accuracy [0.6553587309274792]
The Variational Quantum Circuit (VQC) is a hybrid model in which the quantum circuit handles data inference while classical optimization adjusts the circuit's parameters.
This work analyzes both amplitude- and angle-encoding models and examines how the type of rotational gate applied affects the classification performance of the model (a toy illustration follows below).
The study demonstrates that, under identical model topologies, the difference in accuracy between the best and worst models ranges from 10% to 30%, with differences reaching up to 41%.
arXiv Detail & Related papers (2025-08-01T16:43:45Z)
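As a hedged illustration of the gate dependence this entry reports (mine, not the paper's code), the effect is visible even for one qubit and one feature: RX and RY map the same angle to measurably different states, while RZ applied to |0> changes only a global phase and so encodes nothing by itself.
```python
# Toy NumPy illustration (not the paper's code) of why rotational-gate choice
# matters in angle encoding: the same feature value produces different
# single-qubit states under RX, RY, and RZ.
import numpy as np

def rx(t): return np.array([[np.cos(t / 2), -1j * np.sin(t / 2)],
                            [-1j * np.sin(t / 2), np.cos(t / 2)]])

def ry(t): return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                            [np.sin(t / 2), np.cos(t / 2)]])

def rz(t): return np.array([[np.exp(-1j * t / 2), 0],
                            [0, np.exp(1j * t / 2)]])

ket0 = np.array([1.0, 0.0])
x = 0.8  # one feature value, used directly as the rotation angle
for name, gate in (("RX", rx), ("RY", ry), ("RZ", rz)):
    state = gate(x) @ ket0
    z_exp = abs(state[0]) ** 2 - abs(state[1]) ** 2  # <Z> after encoding
    print(f"{name}: state = {np.round(state, 3)},  <Z> = {z_exp:+.3f}")
```
The printout shows <Z> = +1.000 for RZ regardless of x, while RX and RY move it to cos(x); on hardware, which gate the ansatz pairs with the encoding decides what the model can actually see.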
- Enhancing the performance of Variational Quantum Classifiers with hybrid autoencoders [0.0]
We propose an alternative method which reduces the dimensionality of a given dataset by taking into account the specific quantum embedding that comes after.
This method aspires to make quantum machine learning with VQCs more versatile and effective on high-dimensional datasets (a generic autoencoder sketch follows below).
arXiv Detail & Related papers (2024-09-05T08:51:20Z)
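A minimal PyTorch sketch of the shared idea behind this entry and the autoencoder entry above, under loudly illustrative assumptions (layer widths, a 4-dimensional latent budget, random stand-in data); the paper's hybrid method additionally conditions the reduction on the downstream quantum embedding, which is not reproduced here.
```python
# Hedged sketch (not the paper's architecture): compress inputs to as many
# latent features as there are qubits, then reuse the bottleneck activations
# as rotation angles for a quantum embedding.
import torch
import torch.nn as nn

n_features, n_qubits = 16, 4
encoder = nn.Sequential(nn.Linear(n_features, 8), nn.Tanh(),
                        nn.Linear(8, n_qubits), nn.Sigmoid())  # latents in (0, 1)
decoder = nn.Sequential(nn.Linear(n_qubits, 8), nn.Tanh(),
                        nn.Linear(8, n_features))
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-2)

X = torch.rand(256, n_features)  # random stand-in dataset
for _ in range(200):             # plain reconstruction training
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(X)), X)
    loss.backward()
    opt.step()

angles = encoder(X).detach() * torch.pi  # one rotation angle per qubit
print(angles.shape)                      # torch.Size([256, 4])
```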
- Learning Density Functionals from Noisy Quantum Data [0.0]
Noisy intermediate-scale quantum (NISQ) devices are used to generate training data for machine learning (ML) models.
We show that a neural-network ML model can successfully generalize from small datasets subject to noise typical of NISQ algorithms.
Our findings suggest a promising pathway for leveraging NISQ devices in practical quantum simulations.
arXiv Detail & Related papers (2024-09-04T17:59:55Z)
- Transition Role of Entangled Data in Quantum Machine Learning [51.6526011493678]
Entanglement serves as the resource to empower quantum computing.
Recent progress has highlighted its positive impact on learning quantum dynamics.
We establish a quantum no-free-lunch (NFL) theorem for learning quantum dynamics using entangled data.
arXiv Detail & Related papers (2023-06-06T08:06:43Z)
- Post-training Model Quantization Using GANs for Synthetic Data Generation [57.40733249681334]
We investigate the use of synthetic data as a substitute for real calibration data in post-training quantization.
We compare the performance of models quantized using data generated by StyleGAN2-ADA and our pre-trained DiStyleGAN, with quantization using real data and an alternative data generation method based on fractal images.
arXiv Detail & Related papers (2023-05-10T11:10:09Z)
- A didactic approach to quantum machine learning with a single qubit [68.8204255655161]
We focus on the case of learning with a single qubit, using data re-uploading techniques.
We implement the different proposed formulations on toy and real-world datasets using the Qiskit quantum computing SDK (a compact re-uploading sketch follows below).
arXiv Detail & Related papers (2022-11-23T18:25:32Z)
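The data re-uploading trick named in this entry can be sketched in a few lines; the NumPy version below is my illustration (the paper itself uses Qiskit), with the layer count and untrained parameters as placeholder assumptions.
```python
# Compact NumPy sketch of single-qubit data re-uploading: the input x is fed
# into every layer, interleaved with trainable parameters, which lets a
# single qubit realize nonlinear decision boundaries.
import numpy as np

def ry(t):
    return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                     [np.sin(t / 2), np.cos(t / 2)]])

def reupload_score(x, weights, biases):
    """Apply RY(w_l * x + b_l) layer by layer; return P(|0>) as a class score."""
    state = np.array([1.0, 0.0])
    for w, b in zip(weights, biases):
        state = ry(w * x + b) @ state  # the data x is re-uploaded at every layer
    return abs(state[0]) ** 2

rng = np.random.default_rng(0)
w, b = rng.normal(size=5), rng.normal(size=5)  # 5 layers, here left untrained
print(reupload_score(0.3, w, b))  # a probability-like score in [0, 1]
```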
- Quantum-tailored machine-learning characterization of a superconducting qubit [50.591267188664666]
We develop an approach to characterize the dynamics of a quantum device and learn device parameters.
This approach outperforms physics-agnostic recurrent neural networks trained on numerically generated and experimental data.
This demonstration shows how leveraging domain knowledge improves the accuracy and efficiency of this characterization task.
arXiv Detail & Related papers (2021-06-24T15:58:57Z)
- Provable superior accuracy in machine learned quantum models [2.814412986458045]
We construct dimensionally reduced quantum models by machine learning methods that can achieve greater accuracy than provably optimal classical counterparts.
These techniques illustrate the immediate relevance of quantum technologies to time-series analysis and offer a rare instance where the resulting quantum advantage can be provably established.
arXiv Detail & Related papers (2021-05-30T05:55:49Z)